Forum Stats

  • 3,727,090 Users
  • 2,245,319 Discussions
  • 7,852,584 Comments

Discussions

best way to remove millions of rows

user8983130
user8983130 Member Posts: 353 Bronze Badge
edited July 2020 in SQL & PL/SQL

Dear All,

I have a table with around 60 millions of rows, It contains data from 2008. Now, I have decided to remove data upto 2018,31st of December.There is a column with date which i can use to delete rows. For example,delete from table where column_to_be_del < sysdate -365; The problem is if i want to delete all data at a time it will generate huge number of undo/redo files which the system can't handle. Or, I can try to delete small portion of data in every weekend so that less undo/redo generates. I can figure out these 2 options.

Can you suggest if there are better ways to do the task?? For exmaple, a procedure to delete rows which will commit every 100000 of rows.

Regards,

user8983130

Answers

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,573 Gold Crown
    edited July 2020

    Before anyone can answer that question they'd need to know which version of Oracle, whether or not it's Enterprise edition or Standard edition, and whether or not you're licenced for the Partiiton option.

    Regards

    Jonathan Lewis

  • L. Fernigrini
    L. Fernigrini Data Engineer Sr Consultant Member Posts: 3,529 Bronze Crown
    edited July 2020

    First of all, follow Jonathan suggestion and provide more information. Remembre to follow this guidelines:

    Regarding your question and assuming:

    1) That you have a similar amount of data every year

    2) That you have no dependencies to that table (no foregin keys)

    If you have 12.5 years of data and you want to delete 11 of them (keep 1.5) it may be worth considering just copying that data to a new table, drop the old one and rename the new one as the old, and reconstruct additional things like constraints, indexes, etc. Inserting 7.5 million rows (more or less 1.5 year) is something that an Oracle database can handle.

    If you DO have references, then the work is more complicated (you need to reconstruct them after renaming) , but then your process of just deleting as you mentioned is also incomplete, you need to handle those data dependencies.

    user8983130user8983130
  • user8983130
    user8983130 Member Posts: 353 Bronze Badge
    edited July 2020

    I am glad to hear from you. The db version is 12.2.0.1 and it's enterprise edition with partitioning option. However, the app. vendor didn't give recommendation to partition of the table,instaed, suggested us to delete rows and then do the table move and index rebuild things. Additinally, the table is storing data that is not suitable for partitioning accoring to vendor.

  • user8983130
    user8983130 Member Posts: 353 Bronze Badge
    edited July 2020

    The main reason to remove data is releasing space from the database. It's occupying near about 800GB of database space.

  • user8983130
    user8983130 Member Posts: 353 Bronze Badge
    edited July 2020

    The data is storing in rows  using xml tags.

  • Tubby
    Tubby Member Posts: 6,987 Bronze Crown
    edited July 2020
    user8983130 wrote:I am glad to hear from you. The db version is 12.2.0.1 and it's enterprise edition with partitioning option. However, the app. vendor didn't give recommendation to partition of the table,instaed, suggested us to delete rows and then do the table move and index rebuild things. Additinally, the table is storing data that is not suitable for partitioning accoring to vendor.

    Given that I would say your best bet is likely to

    0) lock the table (ensure that other sessions aren't allowed to perform DML while you are doing this maintenance, so do whatever you need to do to accomplish that)

    1) create a table with the rows you need to keep from the existing table

    2) truncate existing table

    3) insert (append) from table in step #1 into existing table

    4) if everything is good drop the table created in step #1

    I'd recommend that over dropping the existing table to avoid having grant issues; and since this is a vendored application you want to do as little as possible that may cause issues with it. Maybe schedule this process to happen annually?

    Cheers,

    user8983130
  • Paulzip
    Paulzip Member Posts: 8,227 Gold Crown
    edited July 2020

    I've had similar business constraints in the past and I've approached it in a few ways

    1. Use create table as select for the rows you want to keep, migrate indexes, triggers, constraints etc, drop old table, rename new table to old

    2. Use partitioning, convert to partitioned table maybe partitioned by date, then you can drop partitions or take them offline.

    3. Use DBMS_Parallel_Execute and chunk the delete up by ROWID, run it over the weekend - can be easily restarted.

    user8983130
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,573 Gold Crown
    edited July 2020

    12.2, Enterprise, with partitioning is the 2nd best place to be.

    You can conver a heap table into a partitioned table, excluding data you don't want to keep, maintaining index, all online in one command.

    In your case you could create a hash partitioned table with one partition, and it should behave the same way as far as the optimizer is concerned as a non-partitioned table,

    See https://jonathanlewis.wordpress.com/2017/06/09/12-2-partitions/  for an example that converts a simple heap table into a list partitioned table - it covers all the high points.

    Note - the "including rows where" clause - you don't wast any resources copying (with redo and undo) any rows you don't want.

    L. Fernigrini's comment was a good one, by the way - can you make a good estimate of how much data you will have left - it's 1.5 years out of 12.5, but if the business has been growing steadily then it's not necessarily the case that you're keeping only 3/25ths (1.5/12.5) of the data, so it would be good to know how much of the data will disappear and how much remain.

    Regards

    Jonathan Lewis

    UPDATE:  2nd best because I thought it wasn't until 18c or 19c that you could do "alter table t1 move including rows where coly >= {constant} online" - but it does actually work in 12.2 (though I'm still trying to find the place in the documentation that says it's supposed to work).

    UPDATE 2: It is in the manual. Here's a reference to the 12.2 SQL Reference manual "Alter table" - https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ALTER-TABLE.html#GUID-552E7373-BF93-477D-9DA3-B2C9… ,

    follow down to "move_table_clause" https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ALTER-TABLE.html#GUID-552E7373-BF93-477D-9DA3-B2C9…

    then down to "filter_condition"  https://docs.oracle.com/en/database/oracle/oracle-database/12.2/sqlrf/ALTER-TABLE.html#GUID-552E7373-BF93-477D-9DA3-B2C9…

    user8983130user8983130
  • EdStevens
    EdStevens Member Posts: 28,024 Gold Crown
    edited July 2020
    user8983130 wrote:I am glad to hear from you. The db version is 12.2.0.1 and it's enterprise edition with partitioning option. However, the app. vendor didn't give recommendation to partition of the table,instaed, suggested us to delete rows and then do the table move and index rebuild things. Additinally, the table is storing data that is not suitable for partitioning accoring to vendor.

    App vendors are notoriously ignorant of the capabilities of the databases they build on. I once had a vendor tell me that they didn't recommend Oracle because "it can't easily support more than three concurrent connections."

    Is your estimated 800gb of released space going to be permanent, resulting from a one-time delete followed by regular periodic housekeeping?  Or is it the result of the regular periodic housekeeping?  If the latter, it's not worth doing because you will just have to re-aquire the space very soon.

  • user8983130
    user8983130 Member Posts: 353 Bronze Badge
    edited July 2020

    It's going to be regular activity, like once in a year.

  • user8983130
    user8983130 Member Posts: 353 Bronze Badge
    edited July 2020

    thank you all for your kind responses. It did really enhance my knowledge regarding data archival/purging activity.

    However,in my situation,it's little tricky since there are 4 tables among where we do have parent-child relationship,but i would definitely test on my UAT db regarding all the steps.

Sign In or Register to comment.