
Need to delete rows fast

Uday_N Member Posts: 320 Bronze Badge

Hi All,

I need to delete a million records quickly. I have tried a few options, but all of them take a long time. I have provided an example below.

Table :

CREATE TABLE SOURCE_TABLE (ID NUMBER, CREATE_DATE DATE);

CREATE TABLE TARGET_TABLE (TG_ID NUMBER, ID NUMBER, CREATE_DATE DATE);

My scenario:

I need to delete target_table records based on the id and today's date.

The volume of data to be deleted from the target table is 108444 rows, out of 228997 total records. It takes a lot of time, more than 2-3 hours. Sometimes the data has been deleted at the back end, but the session times out.

Query 1 I tried:

delete from target_table where trunc(create_date) = trunc(sysdate) and id in (select id from source_table);

Query 2 I tried:

delete from target_table where trunc(create_date) = trunc(sysdate) and id in (select id from source_table);

Query 3 I tried:

set serveroutput on

declare
  cursor c1 is
    select * from source_table;
begin
  -- row-by-row delete: one DELETE per id fetched from source_table
  for i in c1 loop
    delete from target_table
     where trunc(create_date) = trunc(sysdate)
       and id = i.id;
  end loop;
end;
/


With the 1st and 2nd queries it takes more than 2 hours, while the 3rd query took more than 5 hours; the session timed out, but the rows were deleted at the back end.


Is there any way to delete the records quickly? I didn't try a FORALL delete; should I try that as well? Please advise.
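
For reference, the FORALL version I have in mind is something like this (just a sketch, untested):

declare
  type t_id_tab is table of source_table.id%type;
  l_ids t_id_tab;
begin
  -- fetch the ids once, then delete them in a single bulk-bound statement
  select id bulk collect into l_ids from source_table;

  forall i in 1 .. l_ids.count
    delete from target_table
     where trunc(create_date) = trunc(sysdate)
       and id = l_ids(i);
end;
/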


Regards,

Uday

Answers

  • Paulzip Member Posts: 8,748 Blue Diamond

    Add an index on target_table (id, create_date)
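
    For example (the index name here is just illustrative):

    create index target_table_id_cd_idx on target_table (id, create_date);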

    Then try this...

    delete from target_table 
    where create_date >= trunc(sysdate) and 
          create_date < trunc(sysdate) + 1 and 
          id in (select id from source_table) ;
    

    This will facilitate index usage, as you won't be applying a function to create_date.

  • Uday_N Member Posts: 320 Bronze Badge
    Answer ✓

    Hi Paul,

    Thanks for your suggestion . I will apply now . Thanks once again

  • Frank Kulash Member, Moderator Posts: 42,480 Red Diamond
    Answer ✓

    Hi, @Uday_N

    Always try to use the raw column from your table rather than some expression based on the column. As Paulzip coded it above,

    where create_date >= trunc(sysdate) and 
          create_date < trunc(sysdate) + 1
    

    is likely to be a lot more efficient than

    where trunc(create_date ) = trunc(sysdate) 
    

    because it allows the optimizer to use an index on create_date. Also, it requires fewer function calls. TRUNC (SYSDATE) only needs to be evaluated once, no matter how big the table is, but TRUNC (create_date) needs to be called for every row. If you have over 200,000 rows in the table, that could be 200,000 function calls.

  • User_H3J7U Member Posts: 1,062 Gold Trophy
    edited Dec 27, 2021 3:06PM Answer ✓

    Total records in the target table: 228997; rows to be deleted: 108444.

    A table of (number, number, date) will consume only 1-10 MB, and about 50% of its rows are to be deleted, so an index on target_table is not needed.
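
    (Rough arithmetic, assuming typical row sizes: two NUMBER columns plus a DATE come to roughly 15-30 bytes per row, so 228997 rows are on the order of 3-7 MB, which is cheap to full-scan.)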

  • Uday_N Member Posts: 320 Bronze Badge

    Hi Frank,

    Thanks for the suggestions. Your guidance has always helped me, and I will apply it henceforth.

    Hi User_H3J7U,

    Thanks for your suggestion. It was kind of you to reply to me.


    Regards,

    Uday

  • mathguy Member Posts: 10,670 Blue Diamond

    If you want help, you must give accurate information. When you go to the doctor, you don't tell him your back is hurting when you have a leg injury, do you?

    "I need to delete a million records" vs "volume of data to be deleted is 108444". Which one should we believe? We can't believe both, obviously.

    Then - you say the tables are "examples". That's bad. What is the structure of your ACTUAL tables? "Something went over my leg, for example a bicycle". When in reality it was a truck. Sorry, can't give you good answers if we have incomplete (or wrong) information.

    Before you jump to any solutions, try to diagnose the problem. Find out what is slowing down your DELETE. It may be that "selecting" the rows to delete is taking a long time. You didn't tell us anything about the SOURCE table; if it is very large, checking the id's may take a very long time. An index on the id column in THAT table may help.

    Much more likely, when you delete rows in a simple scenario like you have, what is taking a long time is the actual "delete" part - writing the undo segments and the redo logs. Not so much if the table structure is as you showed us, but more likely if the tables have many more columns.

    Another thing to keep an eye on is whatever may be blocking / delaying your operation. Rows locked by other operations, waiting for those other operations to complete. (So, your own DELETE is not slow; it just must wait in line.) If that's the issue, then you are barking up the wrong tree. Figure out what else is going on in the database at the same time. (But, first, find out if THAT is the issue and not something else.) If you really only have to delete about 100k rows from a table with about 200k rows, then this would be the most likely cause IMO.
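
    One way to check for that (a sketch; querying V$SESSION needs the appropriate privileges):

    select sid, blocking_session, event, seconds_in_wait
      from v$session
     where blocking_session is not null;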

    You also didn't tell us about the location of different things. Is this all in one database, running on one server? Or are you working over a slow network, with the two tables in different databases on different servers, etc.?

    Perhaps you can also explain the difference between Query 1 and Query 2; I didn't see any.

  • Uday_N Member Posts: 320 Bronze Badge

    Hi Math,

    Sure, thanks for your detailed information. I will check what is causing the issue. I am also very sorry for not being able to give the details of the actual tables; I say this because I have huge respect for the people in this group. I will look into what is causing the delay and report back to the group. Thanks a lot.

  • Dave_VZ Member Posts: 47 Red Ribbon

    @mathguy Hi, I'm an old dinosaur from the Oracle 6 days, and I sympathize with both you and the asker. The message came across to me as derisive, though that is probably my misunderstanding, and no offense is intended. What I'd like to contribute is that there are numerous circumstances in which accuracy may be illegal: HIPAA and US DoD work specifically, and non-disclosure or confidentiality agreements in general. As Uday discovers redaction techniques, I trust his message will improve.

    @Frank Kulash In my outmoded ways I still think of the compound condition (A AND B) as requiring two passes against the result set. If that's still true, I speculate that the condition drops to one I/O pass if the condition becomes "x BETWEEN trunc(sysdate) AND trunc(sysdate) + 1". If I recall correctly, the BETWEEN conditions are also inclusive, which eliminates the extra I/O required for the GE qualifier. Food for thought.

  • Solomon Yakobson Member Posts: 19,722 Red Diamond
    edited Dec 28, 2021 1:54AM

    @Dave_VZ In my outmoded ways I still think of the compound condition (A AND B) as requiring two passes against the result set.

    No, there will be a single pass in which each row is read once; condition A (or B; the optimizer decides the predicate order) is checked first, and only if it is TRUE is the second condition checked. That's if we do a table full scan. In the case of an index, one or both conditions will be checked against the index.

    Operator BETWEEN is nothing but a "shortcut": X BETWEEN Y AND Z results in X >= Y AND X <= Z. Just check the plan:

    SQL> explain plan for select * from emp where sal between 1000 and 2000;
    
    Explained.
    
    SQL> select * from table(dbms_xplan.display);
    
    PLAN_TABLE_OUTPUT
    ---------------------------------------------------------------------------
    Plan hash value: 3956160932
    
    --------------------------------------------------------------------------
    | Id  | Operation         | Name | Rows  | Bytes | Cost (%CPU)| Time     |
    --------------------------------------------------------------------------
    |   0 | SELECT STATEMENT  |      |     6 |   228 |     3   (0)| 00:00:01 |
    |*  1 |  TABLE ACCESS FULL| EMP  |     6 |   228 |     3   (0)| 00:00:01 |
    --------------------------------------------------------------------------
    
    Predicate Information (identified by operation id):
    ---------------------------------------------------
    
    PLAN_TABLE_OUTPUT
    ----------------------------------------------------------------------------
    
       1 - filter("SAL"<=2000 AND "SAL">=1000)
    
    13 rows selected.
    
    SQL>
    

    Now to the OP's report that deleting 108444 rows "takes a lot of time, more than 2-3 hours". I see two possible causes:

    1. Some other session(s) hold locks on the rows to be deleted, causing the DELETE to wait.
    2. Some other session(s) are modifying SOURCE_TABLE, causing the DELETE to spend time digging into UNDO.

    The OP needs to run AWR (or trace the DELETE).
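
    For the trace, one option is something like this (a sketch; assumes you can execute DBMS_MONITOR in the session running the DELETE):

    -- enable SQL trace with wait events and bind values for the current session
    exec dbms_monitor.session_trace_enable(waits => true, binds => true)
    -- run the DELETE, then turn tracing off:
    exec dbms_monitor.session_trace_disable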

    SY.

  • Solomon Yakobson Member Posts: 19,722 Red Diamond

    Actually, I missed another rather simple possibility: table SOURCE_TABLE is very large and there is no index on column ID. In that case the optimizer will most likely do a hash join of TARGET_TABLE and SOURCE_TABLE, which can take a while. The OP needs to provide an explain plan.
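
    Something along these lines (a sketch, reusing the rewritten date predicate from earlier in the thread):

    explain plan for
    delete from target_table
     where create_date >= trunc(sysdate)
       and create_date <  trunc(sysdate) + 1
       and id in (select id from source_table);

    select * from table(dbms_xplan.display);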

    SY.