
What is the best way to delete millions of rows from a large table?

Sevdimali Isayev Member Posts: 123 Red Ribbon
edited Jul 24, 2018 11:47AM in SQL & PL/SQL

Hi everyone,

I have a large table with more than 5 million rows.

The table is not partitioned.

On average, 50-60 rows are inserted into the table every second.

The table has 12 columns (one BLOB and one CLOB) and three indexes.

What is the best way to delete rows from this table?

If any other information is needed, please let me know.


Best Answer

  • BrunoVroman Member Posts: 1,848 Silver Crown
    edited Jul 23, 2018 8:47AM Accepted Answer

    Oh, by the way:

    more than 5 million rows.  on average 50-60 rows inserted in every second

    Hmm, 55 rows/second * 86,400 seconds/day = 4,752,000 rows per day... Does the table contain data for only about one day? And maybe (depending on which rows you have to delete, and on whether you have to repeat the operation often, perhaps daily): the table seems a good candidate for partitioning. Maybe you can change "I have to delete millions of rows" into "I have to drop (a) partition(s)".

    Best regards,

    Bruno Vroman.
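    A minimal sketch of the partitioning idea, assuming a DATE column (here called created_at, a hypothetical name) drives the purge; with interval partitioning, Oracle creates the daily partitions automatically:

    ```sql
    -- Hypothetical names (my_table, created_at); adapt to the real schema.
    create table my_table (
      id         number,
      created_at date not null,
      payload    blob,
      notes      clob
    )
    partition by range (created_at)
    interval (numtodsinterval(1, 'DAY'))
    (partition p0 values less than (date '2018-01-01'));

    -- Purging a day becomes a fast metadata operation instead of a huge DELETE:
    alter table my_table drop partition for (date '2018-07-01') update indexes;
    ```

    Note that converting an existing non-partitioned table would itself require downtime or an online redefinition (DBMS_REDEFINITION).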


Answers

  • BEDE Oracle Developer Member Posts: 2,277 Gold Trophy
    edited Jul 23, 2018 8:17AM

    The main concern should be: are there any foreign keys referencing the table you want to delete from? If the answer is yes, then it's going to be really rough on the database. Otherwise a simple delete from x_table where ... should do.

    If there are 5M rows to delete, how many rows will remain in the table? If you delete more than, say, 25% of the rows in table_x, maybe it's better to do something like the below:

    create table table_x_del as select * from table_x where ...; -- the where condition should select all the rows you will keep

    drop table table_x;

    rename table_x_del to table_x;

    Of course, this will not work on a system where table_x is to be accessed by users.

    And there is one more thing that is important: do you have to delete all those records in one single transaction? If the answer is yes, then a simple delete from table_x where ... is the only solution.

    Otherwise there may be variations like using dbms_parallel_execute.

  • Cookiemonster76 Member Posts: 3,410
    edited Jul 23, 2018 8:16AM

    If you go the CTAS route remember it doesn't copy constraints or triggers.

  • BEDE Oracle Developer Member Posts: 2,277 Gold Trophy
    edited Jul 23, 2018 8:24AM

    Right! Those are to be created afterwards. And the same goes for indexes.

    Just to simplify, it would be better:

    create table table_x_del as select * from table_x where ...; -- select only the rows you want to keep

    truncate table table_x;

    insert into table_x select * from table_x_del;

    drop table table_x_del;
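    One hedged way to handle the "create them afterwards" step is to capture the dependent DDL before dropping or truncating anything; a sketch using DBMS_METADATA (schema and table names hypothetical; GET_DEPENDENT_DDL raises an error if no object of the requested type exists):

    ```sql
    -- Capture the DDL of everything hanging off TABLE_X so it can be
    -- re-created on the replacement table.
    select dbms_metadata.get_dependent_ddl('INDEX',      'TABLE_X', 'MY_SCHEMA') from dual;
    select dbms_metadata.get_dependent_ddl('CONSTRAINT', 'TABLE_X', 'MY_SCHEMA') from dual;
    select dbms_metadata.get_dependent_ddl('TRIGGER',    'TABLE_X', 'MY_SCHEMA') from dual;
    ```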

  • Paulzip Member Posts: 8,423 Blue Diamond
    edited Jul 23, 2018 8:20AM

    If you are deleting a large percentage of the data, I'd probably consider the CTAS approach, but you'll have to migrate constraints, triggers, indexes, etc.

    If you're not keen on this approach (which in some situations is perfectly justifiable), I would go the DBMS_PARALLEL_EXECUTE route. It's precisely the sort of situation it was designed for.

  • Sevdimali Isayev Member Posts: 123 Red Ribbon
    edited Jul 23, 2018 8:24AM

    Hi, thanks for the reply.

    I am not able to choose the CTAS method, because I have many applications inserting into this table; if I drop the table they will get exceptions, which is forbidden.

  • BrunoVroman Member Posts: 1,848 Silver Crown
    edited Jul 23, 2018 8:39AM

    Hello,

    About "CTAS": the same goes for privileges. Be fully aware that the CTAS way means some application downtime (no new data should be inserted/modified in the table before the CTAS starts).

    Alternative: expdp with a filter on the rows to keep, then drop the table (if expdp is successful) and impdp it back. (Obviously this also requires "application downtime"; the advantage of this alternative is that it takes care of constraints, indexes, privileges, and triggers.)

    Best regards,

    Bruno Vroman.

    Edited: clarification after Paulzip's remark.
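    A sketch of the expdp/impdp alternative (directory object, user, and the row filter are hypothetical; the Data Pump QUERY parameter restricts the export to the rows to keep):

    ```shell
    # Export only the rows to keep (the WHERE clause is hypothetical).
    expdp my_user DIRECTORY=dump_dir DUMPFILE=table_x.dmp TABLES=table_x \
          QUERY='table_x:"WHERE created_at >= SYSDATE - 30"'

    # After verifying the dump:  SQL> drop table table_x purge;

    # Re-import; indexes, constraints, grants and triggers come back with the table.
    impdp my_user DIRECTORY=dump_dir DUMPFILE=table_x.dmp TABLES=table_x
    ```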

  • Paulzip Member Posts: 8,423 Blue Diamond
    edited Jul 23, 2018 8:26AM
    Bruno.Vroman wrote:
    Hello, about "CTAS": and privileges. Be fully aware that the CTAS way means some application downtime (no new data should be inserted/modified in the table before the CTAS starts). Alternative: expdp with a filter on rows to keep, then drop table (if expdp is successful) and impdp it back. Best regards, Bruno Vroman.

    Not going to work in a live system that cannot be offline.

  • Paulzip Member Posts: 8,423 Blue Diamond
    edited Jul 23, 2018 8:28AM

    Example of DBMS_PARALLEL_EXECUTE...

    declare
      l_sql_stmt varchar2(1000);
      l_try      number := 0;
      l_status   number;
    begin
      dbms_parallel_execute.create_task('MYTASK');  -- create the task
      -- chunk the table by rowid, 100 rows per chunk (by_row => true)
      dbms_parallel_execute.create_chunks_by_rowid('MYTASK', 'MY_SCHEMA', 'MY_TABLE', true, 100);
      -- execute the DML in parallel; add whatever other predicates you are
      -- filtering on to the where clause
      l_sql_stmt := 'delete from MY_TABLE where rowid between :start_id and :end_id';
      dbms_parallel_execute.run_task('MYTASK', l_sql_stmt, dbms_sql.native, parallel_level => 10);
      -- if there was an error, resume the task at most 2 times
      l_status := dbms_parallel_execute.task_status('MYTASK');
      while (l_try < 2 and l_status != dbms_parallel_execute.finished)
      loop
        l_try := l_try + 1;
        dbms_parallel_execute.resume_task('MYTASK');
        l_status := dbms_parallel_execute.task_status('MYTASK');
      end loop;
      dbms_parallel_execute.drop_task('MYTASK');  -- done with processing; drop the task
      commit;
    end;
    /

  • Joerg.Sobottka Senior Consultant and Oracle Ace Member Posts: 595 Bronze Trophy
    edited Jul 23, 2018 8:31AM

    5 million rows isn't a large table. If 50-60 rows are added per second, just delete what you don't need anymore. In my opinion, if the LOBs are not very big, it does not even make sense to shrink the table, as the space will be allocated again very soon.
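    For the "just delete what you don't need" route, a common pattern is to delete in capped batches so each transaction's undo stays modest; a hedged PL/SQL sketch (table name and retention predicate are hypothetical):

    ```sql
    begin
      loop
        delete from my_table              -- hypothetical table name
         where created_at < sysdate - 30  -- hypothetical retention predicate
           and rownum <= 10000;           -- cap the batch size
        exit when sql%rowcount = 0;       -- stop when nothing is left to delete
        commit;                           -- release undo/locks between batches
      end loop;
      commit;
    end;
    /
    ```

    The usual caveat applies: committing inside a loop trades restartability for smaller transactions; if the session dies midway, the already-committed batches stay deleted.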

This discussion has been closed.