7 Replies Latest reply: Jan 14, 2013 7:19 AM by Richard Harrison . RSS

    Data Archiving  - Datapump ?

    user621430
      Hi,

      In order to reduce the volume of data in our main Transactional tables, we are planning to create a new Archive table and move data to the Archive tables.

      I will run this job once in six months to pull old data from my Transaction table into my Archive table -
      Essentially the three steps
      - Find/Select data in transactional table using the Transaction_Date column
      - Insert the selected data into Archive table
      - Delete the selected data from Transactional table

      I was thinking of writing a PL/SQL block, but came across Oracles Datapump and was wondering if Datapump would be a better approach - especially where you need to load in to a target and also delete from the source.

      If you have other ideas, feel free to suggest them too.


      Thanks!
      Anand
        • 1. Re: Data Archiving  - Datapump ?
          Richard Harrison .
          Hi,
          Partitioning the source table and then possibly using partition exchange is probably the best solution if you have a partitioning licence. I'm not sure datapump is the right technology for what you want to do here - it might possibly be if you want to archive to a different database - if not I'd just be using SQL to do what you want to do.

          Regards,
          Harry
          • 2. Re: Data Archiving  - Datapump ?
            user621430
            Thanks for the response Harry.

            Is there an option in Datapump export to delete the exported rows from the source table?
            • 3. Re: Data Archiving  - Datapump ?
              Richard Harrison .
              Hi,
              Datapump does no changes to the 'source' where it extracts from - so it won't be able to delete any data. You'd have to manually delete data after any extract you did.

              Regards,
              Harry
              • 4. Re: Data Archiving  - Datapump ?
                user621430
                Thank you. I was considering SQL too, but that would mean all the three steps are done at a single step - Select from source , Insert into Archive, Delete from source.

                With something like a Datapump, I can split the process in to two steps and hence reduce the downtime of the production site.

                Excuse my ignorance about datapumps, but is import better than plain inserts in terms of performance?

                Thanks!
                • 5. Re: Data Archiving  - Datapump ?
                  Richard Harrison .
                  Hi,
                  Almost always plain old SQL is the fastest way of doing anything in the database. Datapump is very optimized to be as fast as possible but a CTAS or insert append parallel statement should always be faster than a datapump import statement i would say. I don't see that you should really need any outage on the system to do what you want to? I assume the old data is not changed once created so rows can be moved to the archive table without affecting the running system (well not noticeably anyway. You should do it in just 2 statements:

                  1) insert into archive select x,x,x from source; ( no seperate select first and don't use pl/sql cursors....)
                  2) delete from source

                  If you know your data won;t change then this is fine - if it can change you will need to lock the data in some way or at least do the insert/delete inside the same plsql block.

                  Cheers,
                  Harry
                  • 6. Re: Data Archiving  - Datapump ?
                    Dean Gagne
                    Harry,

                    I know you said "almost always"... I want to just add one comment. Data Pump will try to use direct path to move data which will be faster than insert as select.

                    Thanks

                    Dean
                    • 7. Re: Data Archiving  - Datapump ?
                      Richard Harrison .
                      Hi Dean,
                      Adding the APPEND hint should make the insert run in direct path mode also - but by default it won't you are correct.

                      Cheers,
                      Harry