11 Replies Latest reply on Jul 24, 2020 6:45 PM by user8983130

    best way to remove millions of rows

    user8983130

      Dear All,

       

      I have a table with around 60 millions of rows, It contains data from 2008. Now, I have decided to remove data upto 2018,31st of December.There is a column with date which i can use to delete rows. For example,delete from table where column_to_be_del < sysdate -365; The problem is if i want to delete all data at a time it will generate huge number of undo/redo files which the system can't handle. Or, I can try to delete small portion of data in every weekend so that less undo/redo generates. I can figure out these 2 options.

       

      Can you suggest if there are better ways to do the task?? For exmaple, a procedure to delete rows which will commit every 100000 of rows.

       

      Regards,

        • 1. Re: best way to remove millions of rows
          Jonathan Lewis

          Before anyone can answer that question they'd need to know which version of Oracle, whether or not it's Enterprise edition or Standard edition, and whether or not you're licenced for the Partiiton option.

           

           

          Regards

          Jonathan Lewis

          • 2. Re: best way to remove millions of rows
            L. Fernigrini

            First of all, follow Jonathan suggestion and provide more information. Remembre to follow this guidelines:

            Re: 2. How do I ask a question on the forums?

             

             

            Regarding your question and assuming:

             

            1) That you have a similar amount of data every year

            2) That you have no dependencies to that table (no foregin keys)

             

            If you have 12.5 years of data and you want to delete 11 of them (keep 1.5) it may be worth considering just copying that data to a new table, drop the old one and rename the new one as the old, and reconstruct additional things like constraints, indexes, etc. Inserting 7.5 million rows (more or less 1.5 year) is something that an Oracle database can handle.

             

            If you DO have references, then the work is more complicated (you need to reconstruct them after renaming) , but then your process of just deleting as you mentioned is also incomplete, you need to handle those data dependencies.

            1 person found this helpful
            • 3. Re: best way to remove millions of rows
              user8983130

              I am glad to hear from you. The db version is 12.2.0.1 and it's enterprise edition with partitioning option. However, the app. vendor didn't give recommendation to partition of the table,instaed, suggested us to delete rows and then do the table move and index rebuild things. Additinally, the table is storing data that is not suitable for partitioning accoring to vendor.

              • 4. Re: best way to remove millions of rows
                user8983130

                The main reason to remove data is releasing space from the database. It's occupying near about 800GB of database space.

                • 5. Re: best way to remove millions of rows
                  user8983130

                  The data is storing in rows  using xml tags.

                  • 6. Re: best way to remove millions of rows
                    Tubby

                    user8983130 wrote:

                     

                    I am glad to hear from you. The db version is 12.2.0.1 and it's enterprise edition with partitioning option. However, the app. vendor didn't give recommendation to partition of the table,instaed, suggested us to delete rows and then do the table move and index rebuild things. Additinally, the table is storing data that is not suitable for partitioning accoring to vendor.

                    Given that I would say your best bet is likely to

                    0) lock the table (ensure that other sessions aren't allowed to perform DML while you are doing this maintenance, so do whatever you need to do to accomplish that)

                    1) create a table with the rows you need to keep from the existing table

                    2) truncate existing table

                    3) insert (append) from table in step #1 into existing table

                    4) if everything is good drop the table created in step #1

                     

                    I'd recommend that over dropping the existing table to avoid having grant issues; and since this is a vendored application you want to do as little as possible that may cause issues with it. Maybe schedule this process to happen annually?

                     

                    Cheers,

                    1 person found this helpful
                    • 7. Re: best way to remove millions of rows
                      Paulzip

                      I've had similar business constraints in the past and I've approached it in a few ways

                      1. Use create table as select for the rows you want to keep, migrate indexes, triggers, constraints etc, drop old table, rename new table to old

                      2. Use partitioning, convert to partitioned table maybe partitioned by date, then you can drop partitions or take them offline.

                      3. Use DBMS_Parallel_Execute and chunk the delete up by ROWID, run it over the weekend - can be easily restarted.

                      • 8. Re: best way to remove millions of rows
                        Jonathan Lewis

                        12.2, Enterprise, with partitioning is the 2nd best place to be.

                         

                        You can conver a heap table into a partitioned table, excluding data you don't want to keep, maintaining index, all online in one command.

                        In your case you could create a hash partitioned table with one partition, and it should behave the same way as far as the optimizer is concerned as a non-partitioned table,

                         

                        See https://jonathanlewis.wordpress.com/2017/06/09/12-2-partitions/  for an example that converts a simple heap table into a list partitioned table - it covers all the high points.

                         

                        Note - the "including rows where" clause - you don't wast any resources copying (with redo and undo) any rows you don't want.

                         

                        L. Fernigrini's comment was a good one, by the way - can you make a good estimate of how much data you will have left - it's 1.5 years out of 12.5, but if the business has been growing steadily then it's not necessarily the case that you're keeping only 3/25ths (1.5/12.5) of the data, so it would be good to know how much of the data will disappear and how much remain.

                         

                         

                        Regards

                        Jonathan Lewis

                         

                         

                        UPDATE:  2nd best because I thought it wasn't until 18c or 19c that you could do "alter table t1 move including rows where coly >= {constant} online" - but it does actually work in 12.2 (though I'm still trying to find the place in the documentation that says it's supposed to work).

                         

                        UPDATE 2: It is in the manual. Here's a reference to the 12.2 SQL Reference manual "Alter table" - https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ALTER-TABLE.html#GUID-552E7373-BF93-477D-9DA3-B2C9… ,

                        follow down to "move_table_clause" https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ALTER-TABLE.html#GUID-552E7373-BF93-477D-9DA3-B2C9…

                        then down to "filter_condition"  https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ALTER-TABLE.html#GUID-552E7373-BF93-477D-9DA3-B2C9…

                        1 person found this helpful
                        • 9. Re: best way to remove millions of rows
                          EdStevens

                          user8983130 wrote:

                           

                          I am glad to hear from you. The db version is 12.2.0.1 and it's enterprise edition with partitioning option. However, the app. vendor didn't give recommendation to partition of the table,instaed, suggested us to delete rows and then do the table move and index rebuild things. Additinally, the table is storing data that is not suitable for partitioning accoring to vendor.

                          App vendors are notoriously ignorant of the capabilities of the databases they build on. I once had a vendor tell me that they didn't recommend Oracle because "it can't easily support more than three concurrent connections."

                           

                          Is your estimated 800gb of released space going to be permanent, resulting from a one-time delete followed by regular periodic housekeeping?  Or is it the result of the regular periodic housekeeping?  If the latter, it's not worth doing because you will just have to re-aquire the space very soon.

                          • 10. Re: best way to remove millions of rows
                            user8983130

                            It's going to be regular activity, like once in a year.

                            • 11. Re: best way to remove millions of rows
                              user8983130

                              thank you all for your kind responses. It did really enhance my knowledge regarding data archival/purging activity.

                               

                              However,in my situation,it's little tricky since there are 4 tables among where we do have parent-child relationship,but i would definitely test on my UAT db regarding all the steps.