
Incremental (or any other way) large INSERT


Answers

  • rp0428
    edited Mar 8, 2016 6:15PM
    It will, but it will also use a lot more resources in doing so.
    
    http://docs.oracle.com/cd/E11882_01/server.112/e22490/ldr_modes.htm#SUTIL1329
    "By avoiding index maintenance, you minimize the amount of space required during a direct path load, in the following ways:

      • You can build indexes one at a time, reducing the amount of sort (temporary) segment space that would otherwise be needed for each index.
      • Only one index segment exists when an index is built, instead of the three segments that temporarily exist when the new keys are merged into the old index to make the new index.

    Avoiding index maintenance is quite reasonable when the number of rows to be loaded is large compared to the size of the table."
    To me, dropping the indexes is the safer and less worrisome proposition.
    

    There is nothing 'unsafe' or 'worrisome' in doing the index maintenance rather than dropping them.

    That doc quote ONLY refers to the extra space that is needed.

    Data added by direct path is NOT accessible by existing users or even the user adding the data until it is committed.

    Indexes are NOT rebuilt until AFTER all data has been loaded to the table.

    Oracle can then use a new segment to create mini-indexes for the new data. The merge of the old index data and the new is then done into a third segment, in a way that allows users to continue to access the indexes while Oracle works with them.

    And for an offline operation it is often MUCH faster to do the maintenance rather than drop and recreate multiple indexes. That is especially true for large tables. The maintenance only needs to sort the small amount of new data and then do a merge, whereas the drop/rebuild has to deal with ALL of the data for ALL of the indexes.
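
    For reference, a minimal sketch of the two approaches being compared. All table, index, and column names here are hypothetical:

    ```sql
    -- Option A: direct-path insert, letting Oracle maintain the indexes.
    -- The existing indexes are merged with the new keys after the load.
    INSERT /*+ APPEND */ INTO big_table
    SELECT * FROM staging_table;
    COMMIT;   -- the direct-path data is not visible to anyone until commit

    -- Option B: drop, load, recreate.
    DROP INDEX big_table_ix1;
    INSERT /*+ APPEND */ INTO big_table
    SELECT * FROM staging_table;
    COMMIT;
    -- The recreate has to sort and write ALL rows, old and new:
    CREATE INDEX big_table_ix1 ON big_table (some_col);
    ```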

  • Tubby
    Tubby Member Posts: 6,987 Bronze Crown
    edited Mar 8, 2016 6:54PM
    rp0428 wrote:
    There is nothing 'unsafe' or 'worrisome' in doing the index maintenance rather than dropping them. 
    

    Very much not true. We aren't there to see what's happening, and we don't know very much about the OP's system. Based on what we don't know vs. what we do know, I'd feel a lot warmer inside recommending the OP drop the indexes and recreate them.

    I never said it was unsafe or worrisome from a technical standpoint IF you know everything you would need to know in order to make that assessment. If you're comfortable that you know everything you would need to know based on the contents of this thread then you're certainly welcome to make your own recommendations to the OP.

    Cheers,

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited Mar 9, 2016 3:01AM
    Tubby wrote:
    
    
    rp0428 wrote:
    There is nothing 'unsafe' or 'worrisome' in doing the index maintenance rather than dropping them. 
    
    
    Very much not true. We aren't there to see what's happening, we don't know very much about the OP's system. Based on what we don't know VS what we do know I'd feel a lot warmer inside recommending the OP drop the indexes and recreate them.
    
    I never said it was unsafe or worrisome from a technical standpoint IF you know everything you would need to know in order to make that assessment. If you're comfortable that you know everything you would need to know based on the contents of this thread then you're certainly welcome to make your own recommendations to the OP.
    
    Cheers, 
    

    Exactly, we don't know what's going on. We don't know if index maintenance is actually playing a big part, or whether it still would if the DML were parallelized.

    Dropping an index is not safe; dropping all indexes on a table seems reckless. What happens when another session tries to query the table? It'll have to do a full table scan. What if many sessions try to query the table? A lot of full table scans, and all while the parallel insert is taking up a lot of resources as it is. What if one of the indexes is unique? You now have the possibility of bad data being inserted, which would take manual work to fix.
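
    The unique-index risk is concrete. A hypothetical illustration (the table and index names are made up):

    ```sql
    -- t has a unique index enforcing that id is unique.
    DROP INDEX t_id_uk;

    -- With the index gone, nothing stops a duplicate key from getting in:
    INSERT INTO t (id, val) VALUES (42, 'dup');  -- succeeds even if id 42 already exists
    COMMIT;

    -- The recreate then fails until the bad data is cleaned up by hand:
    CREATE UNIQUE INDEX t_id_uk ON t (id);
    -- ORA-01452: cannot CREATE UNIQUE INDEX; duplicate keys found
    ```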

    What if the time taken to rebuild an index is longer than the time it takes to insert the data while keeping the index? My earlier demo confirms it's possible.

  • Chris Hunt
    Chris Hunt Member Posts: 2,066 Gold Trophy
    edited Mar 9, 2016 6:11AM
    Avoiding index maintenance is quite reasonable when the number of rows to be loaded is large compared to the size of the table
    

    In this case we're adding 7M rows to a 160M row table, so the number of rows to be added is less than 5% of the size of the table. Dropping and recreating the indexes is unlikely to help here.

    I also wouldn't entirely dismiss the idea of doing the insert in a number of stages. It's true that the overall time taken will be longer if you do it that way, but if your resources in terms of redo/rollback space are limited, and/or you only have a certain time window in which to do the transfer, it can still be a valid approach to take. That is assuming that the data in the source table is not being changed between iterations, and that there's a good indexed/partitioned column to use to identify the records to copy each time around.
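
    A staged copy along those lines could be sketched like this. All names (source_t, target_t, load_date) are hypothetical, and it assumes load_date is indexed (or is the partition key) and that the source data is not changing between iterations:

    ```sql
    BEGIN
      FOR d IN (SELECT DISTINCT TRUNC(load_date) AS dt
                FROM   source_t
                ORDER  BY dt)
      LOOP
        INSERT INTO target_t
        SELECT *
        FROM   source_t
        WHERE  load_date >= d.dt
        AND    load_date <  d.dt + 1;  -- range predicate keeps the index usable
        COMMIT;                        -- caps undo/redo per iteration
      END LOOP;
    END;
    /
    ```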

    I think I'd try an insert append first, though.

  • rade.todorovich
    rade.todorovich Member Posts: 1,306
    edited Mar 9, 2016 10:23AM

    Thanks Andrew,

    I think in my case I was seeing a few minutes to create an index, compared to hours (or, in this strange case, days) for the insert. So if I drop the indexes and the insert then takes 1-2 hours or less, it would make sense to spend one more hour (or less) recreating the indexes.

  • AndrewSayer
    AndrewSayer Member Posts: 12,998 Gold Crown
    edited Mar 9, 2016 10:41AM
    rade.todorovich wrote:
    
    Thanks Andrew,
    
    I think in my case I was seeing a few minutes to create an index, compared to hours (or, in this strange case, days) for the insert. So if I drop the indexes and the insert then takes 1-2 hours or less, it would make sense to spend one more hour (or less) recreating the indexes.
    

    But you need to confirm that dropping the indexes will improve the time by that amount. Different indexes will have different effects on the time taken.
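
    One hypothetical way to measure that per-index cost without actually dropping anything: mark one index UNUSABLE (a direct-path insert then skips it), time the load, roll back, and rebuild. Table and index names below are made up, and note that this trick does not work for indexes enforcing a unique or primary key constraint:

    ```sql
    SET TIMING ON

    -- Baseline: load with all indexes maintained, then throw it away.
    INSERT /*+ APPEND */ INTO big_table SELECT * FROM staging_table;
    ROLLBACK;

    -- Take one index out of play and repeat.
    ALTER INDEX big_table_ix1 UNUSABLE;
    ALTER SESSION SET skip_unusable_indexes = TRUE;

    INSERT /*+ APPEND */ INTO big_table SELECT * FROM staging_table;
    ROLLBACK;

    ALTER INDEX big_table_ix1 REBUILD;  -- put things back as they were
    ```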

  • Billy Verreynne
    Billy Verreynne Software Engineer Member Posts: 28,470 Red Diamond
    edited Mar 10, 2016 1:40AM

    Keep in mind that indexes are not the only overheads. Foreign keys, triggers, and constraints can also add significant overheads.

    The same goes for the physical aspects of the update - e.g. when the high watermark leaves insufficient free space in a data block for the updated data, or there is ITL contention, etc.

    The most important thing when dealing with a performance issue is to correctly identify the problem. The WHAT that is causing the performance issue. Dropping/invalidating indexes will be of no use if the performance issue is primarily due to a FK constraint to a parent table that uses a large index range scan to maintain data integrity.
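
    Before blaming the indexes, it is worth seeing what else fires on every row. A quick look at the data dictionary (substitute the real table name for the hypothetical BIG_TABLE):

    ```sql
    -- Foreign keys, checks, and key constraints on the table:
    SELECT constraint_name, constraint_type
    FROM   user_constraints
    WHERE  table_name = 'BIG_TABLE'
    AND    constraint_type IN ('R', 'C', 'U', 'P');

    -- Triggers that fire on DML against the table:
    SELECT trigger_name, triggering_event
    FROM   user_triggers
    WHERE  table_name = 'BIG_TABLE';
    ```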

  • rade.todorovich
    rade.todorovich Member Posts: 1,306
    edited Mar 10, 2016 1:31PM

    John,

    I was finally able to do this with a direct-path insert; it took about 1200 seconds, which is not bad. I think I figured out what may have caused the problem with the conventional insert. I am going to open another thread.

This discussion has been closed.