1 2 Previous Next 22 Replies Latest reply on Aug 7, 2013 3:21 AM by Ishan

    How to query only updated or new rows in Oracle

    81103cac-5cd0-4c3b-844c-88c0208f0f90

      In my project I need to run a cron job - that will query Table at midnight and create data objects out of the selected rows.

      This table will grow in size over a period of time - so intention is to only query (select) - those rows that are new OR rows that have been updated since the previous run.

      Is there any way in oracle to do this.

      regards

        • 1. Re: How to query only updated or new rows in Oracle
          Pacmann

          Hi,

           

          You should add and maintain a lastupdate_date and create_date in your table.

           

          Because if you try to rely on another technical solution, it might not work, for example :

           

          SELECT *

          FROM yourtable

          MINUS

          SELECT *

          FROM yourtable AS OF TIMESTAMP date_of_last_run

           

          This solution would raise ORA-1555 snapshot too old if old version has disappeared from you undo.

          • 2. Re: How to query only updated or new rows in Oracle
            Frank Kulash

            Hi,

             

            Oracle does not automatically keep track of when a row was inserted or updated; you have to keep track of that yourself.

            One way to do that is to include a modified_date column in the table.  You can write a trigger so that this column will always be populated, even if that column is not mentioned in the INSERT or UPDATE statement.  For example:

             

            CREATE OR REPLACE TRIGGER   table_x_biu

            BEFORE INSERT OR UPDATE ON  table_x

            FOR EACH ROW

            BEGIN

                :NEW.modified_date := SYSDATE;

            END;

            /

            SHOW ERRORS

             

             

            In a separate table, record when the cron job ran.  Getting the MAX run_date will be a pretty fast operation, then you can use that value to find rows whose modified_date value is greater.

            • 3. Re: How to query only updated or new rows in Oracle
              Ishan

              If you are looking for exact answer which would further mean any ORACLE utility which would do that for you, I am afraid there isn't any. This is a complete design thing. As rightly suggested by Pacmann, you need to add CREATED_DATE and MODIFIED_DATE as two new columns to achieve your objective.

               

              There could be further cases to your scenario viz.

               

              1) How many times the update can happen on the same row?

              2) If multiple updates are happening within the same day, how would you note the difference between the two updates.

               

              And so on and so forth..

               

              But the basic idea remains the same, "Add these columns".

               

              Thanks,

              Ishan

              • 4. Re: How to query only updated or new rows in Oracle
                Mohamed Houri

                I would have created a historical table in which I would have directed any new insert or update or delete done against my target table . There is an example of how to do that here

                 

                Audit | Search Results | Mohamed Houri’s Oracle Notes

                 

                You could then query the historical table as from the previous run of your ovenight cron job

                 

                select   empno

                          , ename

                          , job

                          , sal

                          , substr(dml_usr,1,10) usr

                          , substr(dml_pgm,1,10) pgm

                          , substr(dml_dat,1,12) dat

                          , dml_operation        dml

                    from emp_history

                    where dml_dat >= last_run_date

                    and dml_operation in ('U','I','D'); 


                As such you could track multiples updates done on the same record in between two job runs


                Best regards

                Mohamed Houri


                 

                Message was edited by: BluShadow Blog link removed from post - Such links are not permitted on the forums

                • 5. Re: How to query only updated or new rows in Oracle
                  Jonathan Lewis

                  A solution using AS AT SCN is technically the safest - although you have highlighted the main problem, you have to configure your database to guarantee the availability of the necessary undo. More significantly, though, the solution is not scalable as the work done is proportional to the total amount of data in the table rather than the required amount of data.

                   

                  Regards

                  Jonathan Lewis


                  • 6. Re: How to query only updated or new rows in Oracle
                    Jonathan Lewis

                    Won't work as stated because of read-consistency.

                     

                    Assume I insert a row that triggers a date stamp of 23:59, but don't commit.

                    You run the cron job at 00:00

                    I commit at 00:01

                     

                    You won't get my row in this run because the change wasn't committed, you won't get it in the next run because the timestamp will be earlier than the timestamp of the cron job,

                     

                    Regards

                    Jonathan Lewis

                    • 7. Re: How to query only updated or new rows in Oracle
                      Jonathan Lewis

                      Standard requirement, but gets messier than you might expect.

                      Possibly the most scalable solution is:

                       

                      create a logging table

                      create a trigger to insert a row into the logging table on insert or update

                       

                      cron job

                      updates a flag column on every row in the table it can see

                      processes the rows in the log table that are flagged

                      deletes the flagged rows in the log table

                       

                      You do have to deal with the logic of how you (want to) handle

                      a) multiple log records for the same base row

                      b) log records which point to a row which gets updated AFTER you've flagged an existing log record but BEFORE you've processed the log (or original) record.

                       

                      EDIT: Alternatively you might look at Advanced Queueing (AQ) - have the trigger enqueue a logging record, and have the cron job dequeue all the logging records it can see.

                      Regards

                      Jonathan Lewis

                      • 8. Re: How to query only updated or new rows in Oracle
                        rp0428

                         

                        In my project I need to run a cron job - that will query Table at midnight and create data objects out of the selected rows.

                        This table will grow in size over a period of time - so intention is to only query (select) - those rows that are new OR rows that have been updated since the previous run.

                        Is there any way in oracle to do this.

                        First - the 'solution' advocated by Pacmann, FrankKulash and Ishan is WRONG, WRONG, WRONG. The idea that you can use columns such as CREATED_DATE and MODIFIED_DATE, update them to SYSDATE using a trigger and then reliably extract chronological sets of data based on those columns is a MYTH that has been busted many, many times in this forum and in others.

                         

                        Except in very limited circumstances that approach can FAIL to extract some data altogether. See my detailed explanation in my reply dated Apr 17, 2013 3:01 PM in this thread from a few months ago.

                        https://forums.oracle.com/message/10970661?tstart=0

                         

                        >

                        I think the chronological order of records should be using a timestamp (i.e. "order by created_date desc" etc.)

                        >

                        Not that old MYTH again! That has been busted so many times it's hard to believe anyone still wants to try to do that.

                        One way to query 'new or updated' data is to use ORA_ROWSCN and extract data based on the SCN of the data. The caveat here is that, by default, Oracle only tracks data at the block level and not the row level.

                         

                        See ORA_ROWSCN in the SQL Language doc

                        http://docs.oracle.com/cd/B28359_01/server.111/b28286/pseudocolumns007.htm

                         

                        ORA_ROWSCN Pseudocolumn

                        For each row, ORA_ROWSCN returns the conservative upper bound system change number (SCN) of the most recent change to the row. This pseudocolumn is useful for determining approximately when a row was last updated. It is not absolutely precise, because Oracle tracks SCNs by transaction committed for the block in which the row resides. You can obtain a more fine-grained approximation of the SCN by creating your tables with row-level dependency tracking. Refer to CREATE TABLE ... NOROWDEPENDENCIES | ROWDEPENDENCIES for more information on row-level dependency tracking.

                         

                        If you create the table with ROWDEPENDENCIES Oracle will track SCN at the row level.

                         

                        Your query can select data based on ORA_ROWSCN each night to extract data with a higher SCN than the previous extract. There is also a small caveat with this even when you use ROWDEPENDENCIES. Delayed block cleanout can cause the SCN to be slightly later than the change actually happened. For best accuracy you need to ensure that all block cleanout has been performed PRIOR to doing the extract. One way to do that is to force a full table scan.

                         

                        See a detailed explanation and example code in my reply in this thread

                        https://forums.oracle.com/thread/2527915

                         

                        Other ways to capture changes are to use a materialized view log, change data capture or log miner.

                        • 9. Re: How to query only updated or new rows in Oracle
                          Stew Ashton

                          Excellent post, however the ora_rowscn solution precludes the use of an index, so as Jonathan said above there could be scalability problems.

                           

                          To allow for the use of date or timestamp fields, Tom Kyte proposed "lock table ... in share mode". See

                          https://asktom.oracle.com/pls/apex/f?p=100:11:0::::P11_QUESTION_ID:16998677475837

                           

                          This ensures that there are no outstanding transactions with older dates than the report date.

                           

                          Later in the same thread, he suggests querying V$TRANSACTION to get the time of the oldest current transaction, and using that as the start date for the report.

                           

                          Personally, I would have used Change Data Capture if Oracle had not explicitly deprecated it...

                           

                          Message was edited by: StewAshton

                          • 10. Re: How to query only updated or new rows in Oracle
                            Ishan

                            RP,

                             

                            I don't think it's MYTH and this can never be implemented using CREATED/MODIFIED Date columns. I agree that it ORA_ROWSCN Pseudocolumn can be used here but I disagree that there is no other way than using ORA_ROWSCN Pseudocolumn to implement the same.

                             

                            The basic crux of your whole logic is various users updating the rows and comitting at different time which would eventually result in incorrect Chronological error and hence TRIGGERS and the above said columns are not correct. Correct me if I am wrong here, as my following logic is based on this inference

                             

                            I have a counter proposal to it.

                             

                            What if I take the value of the sysdate in a variable in the beginning of the code and then use that variable for updating my columns. So if I started updating at 2:01 pm then even if I commit at 4.00 pm it will still show timestamp as 2:01 pm only. No matter at what time the users commit, it would reflect the correct timesstamp of when user logged in first. It would handle the case what you described in your earlier posts.

                             

                            What say?

                            • 11. Re: How to query only updated or new rows in Oracle
                              Most Wanted!!!!

                              i am not that much expert in oracle this came to my mind please neglect it if this is a bad idea

                               

                              suppose ur table is  table_1

                               

                              before inserting in table_1

                               

                              create table table_1_backup as select * From table_1;

                               

                              and after inserting

                              select * From table_1

                              minus

                              select * from table_1_backup;

                               

                              regards

                              friend

                               

                              Message was edited by: mostwanted!!!!

                              • 12. Re: How to query only updated or new rows in Oracle
                                Stew Ashton

                                Your first solution is very expensive in terms of resources, and it doesn't distinguish between inserted data and updated data.

                                 

                                Your second "solution" supposes that newly inserted rows always have greater rowids. Not true. If you delete a row, space becomes free. Oracle can insert into the free space in an existing block instead of inserting into a new block. This would cause the new row to have a lower rowid than other older rows.

                                 

                                Best regards, Stew

                                 

                                Note: mostwanted has now deleted the second "solution" I referred to. Message was edited by: StewAshton

                                • 14. Re: How to query only updated or new rows in Oracle
                                  BluShadow

                                  StewAshton wrote:

                                  Personally, I would have used Change Data Capture if Oracle had not explicitly deprecated it...

                                   

                                  Oracle CDC wasn't that nice to configure, and it is still there in 11g, but they recommend not to use it...

                                   

                                  http://docs.oracle.com/cd/E11882_01/server.112/e25554/cdc.htm#DWHSG016

                                   

                                  Now replaced it seems with Oracle GoldenGate...

                                   

                                  Configuring Oracle GoldenGate

                                  1 2 Previous Next