Oracle Analytics Publisher


Query for calculating CHECKSUM value of a result set

Received Response · 290 Views · 10 Comments
Hitesh Jasuja-Oracle (Rank 2 - Community Beginner)
edited November 2024 in Oracle Analytics Publisher

We have a business requirement to send the checksum (SHA-256) value of an outbound data extract via BIP Bursting for a third-party system to reconcile once they receive the file at their end

If anyone has worked on this type of requirement, please provide your valuable inputs

Answers

  • Hitesh Jasuja-Oracle (Rank 2 - Community Beginner)

    Thanks Mandeep for sharing the links, but we are looking for a kind of hash value generated for the data produced by BIP. The links talk more about detecting changes in a table.

  • MandeepGupta (Rank 7 - Analytics Coach)

    Can you please elaborate a bit more on your requirement? For example, if you are pulling 10 data fields from a query over 10 tables, do you want an additional (11th) column for the CHECKSUM? What will it be used for?

    Is this a full file extract or changes only?

    Thanks.

  • Hitesh Jasuja-Oracle (Rank 2 - Community Beginner)

    Our requirement is to have a data file pushed to a downstream application via Bursting to SFTP. Along with the data file, the customer wants an associated control file containing some columns with static values; CHECKSUM is one of those columns, and the expectation is that it holds the SHA-256 checksum value of the data file extract.

  • Monish Munot-Oracle (Rank 1 - Community Starter)

    Hitesh - One way we can do this is using Java or Python code.
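    A minimal sketch of that approach in Python, using the standard hashlib module. The file names here are hypothetical placeholders for the bursting output and its control file, and the control-file layout (a simple CHECKSUM= line) is an assumption to be replaced with the customer's agreed format:

    ```python
    import hashlib

    def sha256_of_file(path, chunk_size=8192):
        """Stream the file in chunks so large extracts aren't read into memory at once."""
        h = hashlib.sha256()
        with open(path, "rb") as f:
            for chunk in iter(lambda: f.read(chunk_size), b""):
                h.update(chunk)
        return h.hexdigest()

    # Create a small sample extract for illustration (the real file comes from bursting).
    with open("extract_data.csv", "w") as f:
        f.write("EMP_ID,AMOUNT\n1001,250.00\n")

    # Compute the checksum of the data file and write it into the control file.
    checksum = sha256_of_file("extract_data.csv")
    with open("extract_data.ctl", "w") as ctl:
        ctl.write(f"CHECKSUM={checksum}\n")
    ```

    The third party can then recompute SHA-256 over the received data file and compare it against the value in the control file to reconcile the transfer.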

  • Dave Kekish (Rank 5 - Community Champion)

    What we do is the following:

    Import the extract to a staging table and calculate a hash (SHA-256) there. That hash, along with the primary keys (things change if the keys can be null, of course), is used in a T-SQL merge statement in our primary warehouse, so we are only updating/touching rows that actually have a changed value, in addition to inserting new rows.

    I can elaborate further if you have a need.

    -Dave

  • Hitesh Jasuja-Oracle (Rank 2 - Community Beginner)

    Hi Dave,

    Could you please elaborate further on this or share the use case?

  • Dave Kekish (Rank 5 - Community Champion)

    I don't know about a use case; it's just how we handle importing the extracted PVO datasets into our local copy.

    To elaborate:

    Let's pick a relatively simple PVO: FscmTopModelAM.FinExtractAM.GlBiccExtractAM.BalanceExtractPVO

    We import that data every 2 hours by grabbing the csv/pecsv files off the UCM.

    For each CSV file, we import it to a staging table with an automatically calculated SHA-256 hash for the row. After the data is loaded to that staging table, we run a merge stored procedure, which looks something like:

    MERGE DataWarehouse.Oracle.FscmTopModelAM_FinExtractAM_GlBiccExtractAM_BalanceExtractPVO t
    USING
    (
        SELECT *
        FROM #TempFscmTopModelAM_FinExtractAM_GlBiccExtractAM_BalanceExtractPVO_Data
    ) s
    ON (
        t.BalanceCurrencyCode = s.BalanceCurrencyCode
        AND t.BalanceLedgerId = s.BalanceLedgerId
        AND t.BalanceActualFlag = s.BalanceActualFlag
        AND t.BalancePeriodName = s.BalancePeriodName
        AND ((t.BalanceTranslatedFlag = s.BalanceTranslatedFlag)
             OR (t.BalanceTranslatedFlag IS NULL AND s.BalanceTranslatedFlag IS NULL))
        AND t.BalanceCodeCombinationId = s.BalanceCodeCombinationId
    )
    WHEN MATCHED AND t.RowHash <> s.RowHash THEN
        UPDATE SET t.BalanceProjectToDateDrBeq = s.BalanceProjectToDateDrBeq

    That way if the data hasn't changed, we don't actually update the record.

    This one is a little odd in that one of the "keys" can be null, but the principle is the same for the 200+ PVOs and BIP reports we import on a daily basis, not to mention all the other imports we run.
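    The per-row hashing Dave describes is done on the database side, but the idea can be sketched in Python. The column values and the separator choice here are illustrative assumptions; the essential point is that the hash input must be built identically on both sides of the comparison, including NULL handling:

    ```python
    import hashlib

    def row_hash(values, sep="\x1f"):
        """Join column values with a separator unlikely to occur in the data,
        then SHA-256 the result. NULLs map to empty strings here; whatever
        convention is chosen must match the warehouse side exactly."""
        joined = sep.join("" if v is None else str(v) for v in values)
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()

    # Example: two versions of the same logical row. Only a changed value
    # produces a different hash, so unchanged rows are skipped by the merge.
    before = ["USD", 101, "A", "Jan-24", None, 5001, 1250.0]
    after = ["USD", 101, "A", "Jan-24", None, 5001, 1300.0]
    changed = row_hash(before) != row_hash(after)
    ```

    In the merge, this hash plays the role of `RowHash`: rows matched on the keys are updated only when the hashes differ.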

  • Hitesh Jasuja-Oracle (Rank 2 - Community Beginner)

    Thanks Dave,

    Could you please share the sample code for BIP in case you have it, as our use case involves BIP?

  • Dave Kekish (Rank 5 - Community Champion)

    We don't do anything special for BIP report extractions; they run the exact same way. All the hash values are computed on the local database side, not in the cloud.

    We literally have all of the metadata for a report or a PVO in a table and simply generate tables and stored procedures to handle the merges (along with some index work). The hash values are generated and persisted in SQL within the staging tables, and they are stored on a row-by-row basis in the warehouse for comparison.

    Considering the volume of data we process on a daily basis from Oracle alone, it's fairly efficient. I don't know what the use case is for the OP's need, but I would think something similar could be done. Trying to manage an extract and a secondary report to ensure you get the same dataset values would, IMO, be a frustrating exercise. Generate the hash based on the data you actually have in hand.