  • 15. Re: Long running update
    orawiss Oracle ACE
    Looks like you have high CPU consumption.
  • 16. Re: Long running update
    Jonathan Lewis Oracle ACE Director
    855802 wrote:
    Hello,
    Can anyone explain why this simple update statement against a single partition in a large table (~300,000 rows, ~1GB in size for the single partition) is taking a very long time? The most unusual thing I see in the stats is the HUGE number of buffer gets.
    PARTITION P_ESRODS_EXTR_CMG_TRAN_PRE2005 VALUES LESS THAN (TO_DATE(' 2005-01-01 00:00:00', 'SYYYY-MM-DD HH24:MI:SS', 'NLS_CALENDAR=GREGORIAN'))
    LOGGING
    COMPRESS
    TABLESPACE P_ESRODS_EXTR_TRANS_LARGE_DAT
    I see that your table definition has the COMPRESS option enabled. If the data was loaded compressed then it went in with PCTFREE 0, and any update to a row is almost certain to cause row migration. Since these updates happen many times, your level of migrated rows could be high, and some of your "tablescans" could spend a lot of their time doing single-block reads to jump to the migrated rows (I'd have to check that hypothesis; it's not one I've tested before).
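
    As a quick check of whether the segment really did go in with a reduced PCTFREE, something along these lines shows the settings per partition (a sketch; the owner and table names are taken from later in this thread and may need adjusting):

    SELECT partition_name, pct_free, compression
    FROM   dba_tab_partitions
    WHERE  table_owner = 'ESRODS'
    AND    table_name  = 'EXTR_CMG_TRANSACTION_HISTORY';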

    Once you introduce bulk migration in 11g, though, you also need to think about whether or not you're using ASSM, and what your block size is. There was a nasty bug that surfaced some time ago ( http://jonathanlewis.wordpress.com/2009/09/25/bugs/ ) which appeared if you combined ASSM with block sizes larger than 8KB and then did lots of row migration.

    Another poster suggested read-consistency - this is also a possibility. A simple test would be to run an example that doesn't take so long and take snapshots of the session's statistics (v$sesstat or v$mystat) to see how many of your buffer gets equate to the statistics named '% - undo records applied'.
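
    A minimal sketch of that check, run from the session executing a small test update, once before and once after (the LIKE pattern is the point; exact statistic names vary slightly by version):

    SELECT sn.name, st.value
    FROM   v$mystat st, v$statname sn
    WHERE  st.statistic# = sn.statistic#
    AND    sn.name LIKE '%undo records applied';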

    Regards
    Jonathan Lewis
  • 17. Re: Long running update
    Colin Thiner Newbie
    It's running now so I can monitor and gather data. I have one session whose full table scan is taking forever... notice that this session already performed several table scans which completed quite quickly.

    statement:
    UPDATE ESRODS.EXTR_CMG_TRANSACTION_HISTORY
    SET RULE_ID_2 = '7800', REPORT_CODE = 'AP'
    WHERE CLEARING_DEALER_ID = '20212'
    AND SHARE_CLASS = 'W'
    AND PROCESS_DATE BETWEEN '01-DEC-2008' AND '31-DEC-2008'
    AND RULE_ID_2 IS NULL

    Plan
    UPDATE STATEMENT ALL_ROWSCost: 10,810                
         3 UPDATE ESRODS.EXTR_CMG_TRANSACTION_HISTORY           
              2 PARTITION RANGE SINGLE Cost: 10,810 Bytes: 14,400 Cardinality: 200 Partition #: 2 Partitions accessed #49     
                   1 TABLE ACCESS FULL TABLE ESRODS.EXTR_CMG_TRANSACTION_HISTORY Cost: 10,810 Bytes: 14,400 Cardinality: 200 Partition #: 3 Partitions accessed #49

    230 million buffer gets so far...

    Sesstats:
    INST_ID SID CLASS NAME VALUE
    1 760 User logons cumulative 1
    1 760 User logons current 1
    1 760 User opened cursors cumulative 1720
    1 760 User opened cursors current 3
    1 760 User user commits 1712
    1 760 User user rollbacks 0
    1 760 User user calls 6358
    1 760 User recursive calls 9461
    1 760 User recursive cpu usage 1024
    1 760 User session logical reads 304868069
    1 760 User session stored procedure space 0
    1 760 Debug CPU used when call started 106527
    1 760 User CPU used by this session 356927
    1 760 User DB time 261453
    1 760 User cluster wait time 0
    1 760 User concurrency wait time 8322
    1 760 User application wait time 0
    1 760 User user I/O wait time 715502
    1 760 User session connect time 1304787762
    1 760 Debug process last non-idle time 1304787762
    1 760 User session uga memory 1482672
    1 760 User session uga memory max 1482672
    1 760 Debug messages sent 10337
    1 760 Debug messages received 0
    1 760 Debug background timeouts 0
    1 760 User session pga memory 4657776
    1 760 User session pga memory max 6492784
    1 760 Enqueue enqueue timeouts 0
    1 760 Enqueue enqueue waits 0
    1 760 Enqueue enqueue deadlocks 0
    1 760 Enqueue enqueue requests 18892
    1 760 Enqueue enqueue conversions 0
    1 760 Enqueue enqueue releases 18888
    1 760 ParallelServer global enqueue gets sync 171742
    1 760 ParallelServer global enqueue gets async 0
    1 760 ParallelServer global enqueue get time 39
    1 760 ParallelServer global enqueue releases 171739
    1 760 Cache physical read total IO requests 6715342
    1 760 Cache physical read total multi block requests 95784
    1 760 Cache physical read total bytes 188494594048
    1 760 Cache physical write total IO requests 0
    1 760 Cache physical write total multi block requests 0
    1 760 Cache physical write total bytes 0
    1 760 SQL physical IO disk bytes 188494594048
    1 760 SQL cell physical IO interconnect bytes 188494594048
    1 760 Debug spare statistic 1 0
    1 760 Debug spare statistic 2 0
    1 760 Debug spare statistic 3 0
    1 760 Debug spare statistic 4 0
    1 760 ParallelServer IPC CPU used by this session 0
    1 760 ParallelServer gcs messages sent 0
    1 760 ParallelServer ges messages sent 0
    1 760 ParallelServer global enqueue CPU used by this session 0
    1 760 Cache db block gets 166753517
    1 760 Cache db block gets from cache 166753517
    1 760 Cache db block gets from cache (fastpath) 31508070
    1 760 Cache db block gets direct 0
    1 760 Cache consistent gets 138114552
    1 760 Cache consistent gets from cache 138114552
    1 760 Cache consistent gets from cache (fastpath) 5453256
    1 760 Cache consistent gets - examination 132352982
    1 760 Cache consistent gets direct 0
    1 760 Cache physical reads 11508558
    1 760 Cache physical reads cache 11508558
    1 760 Cache physical reads direct 0
    1 760 Cache physical read IO requests 6715342
    1 760 Cache physical read bytes 188494594048
    1 760 Cache db block changes 4468449
    1 760 Cache consistent changes 32205
    1 760 Cache recovery blocks read 0
    1 760 Cache recovery blocks read for lost write detection 0
    1 760 Cache recovery blocks skipped lost write checks 0
    1 760 Cache physical writes 0
    1 760 Cache physical writes direct 0
    1 760 Cache physical writes from cache 0
    1 760 Cache physical write IO requests 0
    1 760 Cache physical reads direct temporary tablespace 0
    1 760 Cache physical writes direct temporary tablespace 0
    1 760 Cache physical write bytes 0
    1 760 Cache db corrupt blocks detected 0
    1 760 Cache db corrupt blocks recovered 0
    1 760 Cache physical writes non checkpoint 0
    1 760 Cache summed dirty queue length 0
    1 760 Cache DBWR checkpoint buffers written 0
    1 760 Cache DBWR thread checkpoint buffers written 0
    1 760 Cache DBWR tablespace checkpoint buffers written 0
    1 760 Cache DBWR parallel query checkpoint buffers written 0
    1 760 Cache DBWR object drop buffers written 0
    1 760 Cache DBWR transaction table writes 0
    1 760 Cache DBWR undo block writes 0
    1 760 Cache DBWR revisited being-written buffer 0
    1 760 Cache DBWR lru scans 0
    1 760 Cache DBWR checkpoints 0
    1 760 ParallelServer & Cache DBWR fusion writes 0
    1 760 Cache prefetch clients - keep 0
    1 760 Cache prefetch clients - recycle 0
    1 760 Cache prefetch clients - default 0
    1 760 Cache prefetch clients - 2k 0
    1 760 Cache prefetch clients - 4k 0
    1 760 Cache prefetch clients - 8k 0
    1 760 Cache prefetch clients - 16k 0
    1 760 Cache prefetch clients - 32k 0
    1 760 Cache change write time 7925
    1 760 Cache redo synch writes 1712
    1 760 Cache redo synch time 2666
    1 760 Cache exchange deadlocks 0
    1 760 Cache free buffer requested 12382668
    1 760 Cache dirty buffers inspected 1192654
    1 760 Cache pinned buffers inspected 92651
    1 760 Cache hot buffers moved to head of LRU 7881084
    1 760 Cache free buffer inspected 13597328
    1 760 Cache commit cleanout failures: write disabled 0
    1 760 Cache commit cleanout failures: block lost 11013
    1 760 Cache commit cleanout failures: cannot pin 0
    1 760 Cache commit cleanout failures: hot backup in progress 0
    1 760 Cache commit cleanout failures: buffer being written 65
    1 760 Cache commit cleanout failures: callback failure 1558
    1 760 Cache commit cleanouts 37245
    1 760 Cache commit cleanouts successfully completed 24609
    1 760 Cache recovery array reads 0
    1 760 Cache recovery array read time 0
    1 760 Cache CR blocks created 2163
    1 760 Cache current blocks converted for CR 1644
    1 760 Cache switch current to new buffer 68833
    1 760 Cache write clones created in foreground 8240
    1 760 Cache write clones created in background 0
    1 760 Cache write clones created for recovery 0
    1 760 Cache recovery block gets from cache 0
    1 760 Cache physical reads cache prefetch 4793216
    1 760 Cache physical reads prefetch warmup 0
    1 760 Cache prefetched blocks aged out before use 98022
    1 760 Cache prefetch warmup blocks aged out before use 0
    1 760 Cache prefetch warmup blocks flushed out before use 0
    1 760 Cache physical reads retry corrupt 0
    1 760 Cache physical reads direct (lob) 0
    1 760 Cache physical writes direct (lob) 0
    1 760 Cache cold recycle reads 0
    1 760 Cache shared hash latch upgrades - no wait 7688
    1 760 Cache shared hash latch upgrades - wait 0
    1 760 Cache physical reads for flashback new 0
    1 760 Debug calls to kcmgcs 132265146
    1 760 Debug calls to kcmgrs 0
    1 760 Debug calls to kcmgas 160359
    1 760 ParallelServer calls to get snapshot scn: kcmgss 127282
    1 760 Redo redo blocks read for recovery 0
    1 760 Redo redo entries 2588072
    1 760 Redo redo size 1194274208
    1 760 Redo redo entries for lost write detection 0
    1 760 Redo redo size for lost write detection 0
    1 760 Redo redo size for direct writes 0
    1 760 Redo redo buffer allocation retries 66
    1 760 Redo redo wastage 0
    1 760 Redo redo writer latching time 0
    1 760 Redo redo writes 0
    1 760 Redo redo blocks written 0
    1 760 Redo redo write time 0
    1 760 Redo redo blocks checksummed by FG (exclusive) 0
    1 760 Redo redo blocks checksummed by LGWR 0
    1 760 Redo redo log space requests 126
    1 760 Redo redo log space wait time 176
    1 760 Redo redo ordering marks 60549
    1 760 Redo redo subscn max counts 64760
    1 760 Redo redo write broadcast ack time 0
    1 760 Redo redo k-bytes read total 0
    1 760 Redo redo k-bytes read (memory) 0
    1 760 Redo redo k-bytes read total by LNS 0
    1 760 Redo redo k-bytes read (memory) by LNS 0
    1 760 ParallelServer & Cache gc cr blocks served 0
    1 760 ParallelServer & Cache gc cr block build time 0
    1 760 ParallelServer & Cache gc cr block flush time 0
    1 760 ParallelServer & Cache gc cr block send time 0
    1 760 ParallelServer & Cache gc current blocks served 0
    1 760 ParallelServer & Cache gc current block pin time 0
    1 760 ParallelServer & Cache gc current block flush time 0
    1 760 ParallelServer & Cache gc current block send time 0
    1 760 ParallelServer & Cache gc cr blocks received 0
    1 760 ParallelServer & Cache gc cr block receive time 0
    1 760 ParallelServer & Cache gc current blocks received 0
    1 760 ParallelServer & Cache gc current block receive time 0
    1 760 ParallelServer & Cache gc local grants 12303281
    1 760 ParallelServer & Cache gc remote grants 0
    1 760 ParallelServer & Cache gc blocks lost 0
    1 760 ParallelServer & Cache gc claim blocks lost 0
    1 760 ParallelServer & Cache gc blocks corrupt 0
    1 760 ParallelServer & Cache gc CPU used by this session 3434
    1 760 ParallelServer & Cache gc reader bypass grants 0
    1 760 Cache total number of slots 0
    1 760 Cache Effective IO time 0
    1 760 Cache Number of read IOs issued 0
    1 760 Cache background checkpoints started 0
    1 760 Cache background checkpoints completed 0
    1 760 Cache number of map operations 0
    1 760 Cache number of map misses 0
    1 760 Redo flashback log writes 0
    1 760 SQL cell physical IO bytes saved during optimized file creation 0
    1 760 SQL cell physical IO bytes saved during optimized RMAN file restore 0
    1 760 SQL cell physical IO bytes eligible for predicate offload 0
    1 760 SQL & Cache Batched IO vector read count 0
    1 760 SQL & Cache Batched IO vector block count 0
    1 760 SQL & Cache Batched IO single block count 0
    1 760 SQL & Cache Batched IO zero block count 0
    1 760 SQL & Cache Batched IO block miss count 0
    1 760 SQL & Cache Batched IO double miss count 0
    1 760 SQL & Cache Batched IO (full) vector count 0
    1 760 SQL & Cache Batched IO (space) vector count 0
    1 760 SQL & Cache Batched IO (bound) vector count 0
    1 760 SQL & Cache Batched IO same unit count 0
    1 760 SQL & Cache Batched IO buffer defrag count 0
    1 760 SQL & Cache Batched IO slow jump count 0
    1 760 Debug shared io pool buffer get success 0
    1 760 Debug shared io pool buffer get failure 0
    1 760 User serializable aborts 0
    1 760 Debug transaction lock foreground requests 0
    1 760 Debug transaction lock foreground wait time 0
    1 760 Debug transaction lock background gets 0
    1 760 Debug transaction lock background get time 0
    1 760 Debug undo change vector size 542301668
    1 760 Debug transaction tables consistent reads - undo records applied 681
    1 760 Debug transaction tables consistent read rollbacks 15
    1 760 Debug data blocks consistent reads - undo records applied 31029
    1 760 Debug no work - consistent read gets 5076297
    1 760 Debug cleanouts only - consistent read gets 38946
    1 760 Debug rollbacks only - consistent read gets 2673
    1 760 Debug cleanouts and rollbacks - consistent read gets 1120
    1 760 Debug RowCR attempts 0
    1 760 Debug RowCR hits 0
    1 760 Debug RowCR - row contention 0
    1 760 Debug RowCR - resume 0
    1 760 Debug rollback changes - undo records applied 3
    1 760 Debug transaction rollbacks 3
    1 760 Debug immediate (CURRENT) block cleanout applications 11096
    1 760 Debug immediate (CR) block cleanout applications 40066
    1 760 Debug deferred (CURRENT) block cleanout applications 45122
    1 760 Debug commit txn count during cleanout 43136
    1 760 Debug active txn count during cleanout 132265920
    1 760 Debug cleanout - number of ktugct calls 132308286
    1 760 Debug immediate CR cleanouts (index blocks) 0
    1 760 Debug deferred CUR cleanouts (index blocks) 0
    1 760 Debug Commit SCN cached 0
    1 760 Debug Cached Commit SCN referenced 0
    1 760 Debug Block Cleanout Optim referenced 2
    1 760 Debug min active SCN optimization applied to CR 0
    1 760 Debug auto extends on undo tablespace 0
    1 760 Debug drop segment calls in space pressure 0
    1 760 Debug total number of undo segments dropped 0
    1 760 Debug doubling up with imu segment 0
    1 760 Debug tune down retentions in space pressure 0
    1 760 Debug steps of tune down ret. in space pressure 0
    1 760 Debug space was found by tune down 0
    1 760 Debug space was not found by tune down 0
    1 760 User commit batch/immediate requested 3
    1 760 User commit batch requested 0
    1 760 User commit immediate requested 3
    1 760 User commit batch/immediate performed 3
    1 760 User commit batch performed 0
    1 760 User commit immediate performed 3
    1 760 User commit wait/nowait requested 0
    1 760 User commit nowait requested 0
    1 760 User commit wait requested 0
    1 760 User commit wait/nowait performed 0
    1 760 User commit nowait performed 0
    1 760 User commit wait performed 0
    1 760 Debug global undo segment hints helped 0
    1 760 Debug global undo segment hints were stale 0
    1 760 Debug local undo segment hints helped 0
    1 760 Debug local undo segment hints were stale 0
    1 760 Debug undo segment header was pinned 0
    1 760 Debug IMU commits 0
    1 760 Debug IMU Flushes 0
    1 760 Debug IMU contention 0
    1 760 Debug IMU recursive-transaction flush 0
    1 760 Debug IMU undo retention flush 0
    1 760 Debug IMU ktichg flush 0
    1 760 Debug IMU bind flushes 0
    1 760 Debug IMU mbu flush 0
    1 760 Debug IMU pool not allocated 0
    1 760 Debug IMU CR rollbacks 0
    1 760 Debug IMU undo allocation size 0
    1 760 Debug IMU Redo allocation size 0
    1 760 Debug IMU- failed to get a private strand 0
    1 760 Debug Misses for writing mapping 0
    1 760 Debug TBS Extension: tasks created 0
    1 760 Debug TBS Extension: tasks executed 0
    1 760 Debug TBS Extension: files extended 0
    1 760 Debug total number of times SMON posted 0
    1 760 Debug SMON posted for undo segment recovery 0
    1 760 Debug SMON posted for txn recovery for other instances 0
    1 760 Debug SMON posted for instance recovery 0
    1 760 Debug SMON posted for undo segment shrink 0
    1 760 Debug SMON posted for dropping temp segment 0
    1 760 SQL table scans (short tables) 17
    1 760 SQL table scans (long tables) 76
    1 760 SQL table scans (rowid ranges) 0
    1 760 SQL table scans (cache partitions) 0
    1 760 SQL table scans (direct read) 0
    1 760 SQL table scan rows gotten 137155534
    1 760 SQL table scan blocks gotten 4893358
    1 760 SQL table fetch by rowid 816902
    1 760 SQL table fetch continued row 5893
    1 760 SQL cluster key scans 16
    1 760 SQL cluster key scan block gets 16
    1 760 SQL rows fetched via callback 0
    1 760 SQL cell scans 0
    1 760 Debug cell blocks processed by cache layer 0
    1 760 Debug cell blocks processed by txn layer 0
    1 760 Debug cell blocks processed by data layer 0
    1 760 Debug cell blocks processed by index layer 0
    1 760 SQL cell commit cache queries 0
    1 760 SQL cell transactions found in commit cache 0
    1 760 SQL cell blocks helped by commit cache 0
    1 760 SQL cell blocks helped by minscn optimization 0
    1 760 SQL cell blocks skipped due to chained rows 0
    1 760 cell simulated physical IO bytes eligible for predicate offload 0
    1 760 cell simulated physical IO bytes returned by predicate offload 0
    1 760 Debug queue update without cp update 0
    1 760 SQL index crx upgrade (prefetch) 0
    1 760 SQL index crx upgrade (found) 0
    1 760 SQL index crx upgrade (positioned) 0
    1 760 Debug leaf node splits 549
    1 760 Debug leaf node 90-10 splits 24
    1 760 Debug branch node splits 0
    1 760 Debug failed probes on index block reclamation 0
    1 760 Debug recursive aborts on index block reclamation 0
    1 760 SQL native hash arithmetic execute 0
    1 760 SQL native hash arithmetic fail 0
    1 760 Cache lob reads 0
    1 760 Cache lob writes 0
    1 760 Cache lob writes unaligned 0
    1 760 SQL index fast full scans (full) 0
    1 760 SQL index fast full scans (rowid ranges) 0
    1 760 SQL index fast full scans (direct read) 0
    1 760 Debug index fetch by key 1741
    1 760 Debug index scans kdiixs1 868
    1 760 Debug queue splits 0
    1 760 Debug queue flush 0
    1 760 Debug queue position update 0
    1 760 Debug queue single row 0
    1 760 Debug queue ocp pages 0
    1 760 Debug queue qno pages 0
    1 760 SQL heap block compress 142418
    1 760 SQL HSC OLTP Space Saving 22355231
    1 760 SQL HSC OLTP Compressed Blocks 3339
    1 760 SQL HSC IDL Compressed Blocks 0
    1 760 SQL HSC Compressed Segment Block Changes 1424073
    1 760 SQL HSC Heap Segment Block Changes 1425667
    1 760 SQL HSC OLTP Non Compressible Blocks 1.84467440737095E19
    1 760 Debug HSC OLTP positive compression 7759
    1 760 Debug HSC OLTP negative compression 0
    1 760 Debug HSC OLTP recursive compression 7759
    1 760 Debug HSC OLTP inline compression 0
    1 760 Debug HSC OLTP Drop Column 0
    1 760 Debug HSC OLTP Compression skipped rows 5714
    1 760 Debug HSC OLTP compression block checked 0
    1 760 SQL Heap Segment Array Inserts 0
    1 760 SQL Heap Segment Array Updates 11
    1 760 Debug securefile allocation bytes 0
    1 760 Debug securefile allocation chunks 0
    1 760 Debug securefile direct read bytes 0
    1 760 Debug securefile direct write bytes 0
    1 760 Debug securefile direct read ops 0
    1 760 Debug securefile direct write ops 0
    1 760 Debug securefile inode read time 0
    1 760 Debug securefile inode write time 0
    1 760 Debug securefile inode ioreap time 0
    1 760 Debug securefile bytes non-transformed 0
    1 760 Debug securefile number of non-transformed flushes 0
    1 760 Debug securefile bytes encrypted 0
    1 760 Debug securefile bytes cleartext 0
    1 760 Debug securefile compressed bytes 0
    1 760 Debug securefile uncompressed bytes 0
    1 760 Debug securefile bytes deduplicated 0
    1 760 Debug securefile create dedup set 0
    1 760 Debug securefile destroy dedup set 0
    1 760 Debug securefile add dedupd lob to set 0
    1 760 Debug securefile rmv from dedup set 0
    1 760 Debug securefile reject deduplication 0
    1 760 Debug securefile dedup prefix hash match 0
    1 760 Debug securefile number of flushes 0
    1 760 Debug securefile dedup flush too low 0
    1 760 Debug securefile dedup callback oper final 0
    1 760 Debug securefile dedup hash collision 0
    1 760 Debug securefile dedup fits inline 0
    1 760 SQL sql area purged 0
    1 760 SQL sql area evicted 2612
    1 760 SQL CCursor + sql area evicted 2
    1 760 Debug DX/BB enqueue lock foreground requests 0
    1 760 Debug DX/BB enqueue lock foreground wait time 0
    1 760 Debug DX/BB enqueue lock background gets 0
    1 760 Debug DX/BB enqueue lock background get time 0
    1 760 Clusterwide global transactions 0
    1 760 Clusterwide global transactions spanning RAC nodes 0
    1 760 Forwarded 2PC commands across RAC nodes 0
    1 760 ParallelServer GTX processes spawned by autotune 0
    1 760 ParallelServer GTX processes stopped by autotune 0
    1 760 SQL session cursor cache hits 122
    1 760 SQL session cursor cache count 50
    1 760 User java call heap total size 0
    1 760 User java call heap total size max 0
    1 760 User java call heap used size 0
    1 760 User java call heap used size max 0
    1 760 User java call heap live size 0
    1 760 User java call heap live size max 0
    1 760 User java call heap object count 0
    1 760 User java call heap object count max 0
    1 760 User java call heap live object count 0
    1 760 User java call heap live object count max 0
    1 760 User java call heap gc count 0
    1 760 User java call heap collected count 0
    1 760 User java call heap collected bytes 0
    1 760 User java session heap used size 0
    1 760 User java session heap used size max 0
    1 760 User java session heap live size 0
    1 760 User java session heap live size max 0
    1 760 User java session heap object count 0
    1 760 User java session heap object count max 0
    1 760 User java session heap live object count 0
    1 760 User java session heap live object count max 0
    1 760 User java session heap gc count 0
    1 760 User java session heap collected count 0
    1 760 User java session heap collected bytes 0
    1 760 Debug cursor authentications 119
    1 760 ParallelServer queries parallelized 0
    1 760 ParallelServer DML statements parallelized 0
    1 760 ParallelServer DDL statements parallelized 0
    1 760 ParallelServer DFO trees parallelized 0
    1 760 ParallelServer Parallel operations not downgraded 0
    1 760 ParallelServer Parallel operations downgraded to serial 0
    1 760 ParallelServer Parallel operations downgraded 75 to 99 pct 0
    1 760 ParallelServer Parallel operations downgraded 50 to 75 pct 0
    1 760 ParallelServer Parallel operations downgraded 25 to 50 pct 0
    1 760 ParallelServer Parallel operations downgraded 1 to 25 pct 0
    1 760 ParallelServer PX local messages sent 0
    1 760 ParallelServer PX local messages recv'd 0
    1 760 ParallelServer PX remote messages sent 0
    1 760 ParallelServer PX remote messages recv'd 0
    1 760 SQL & Cache buffer is pinned count 1753861
    1 760 SQL & Cache buffer is not pinned count 227318
    1 760 SQL & Cache no buffer to keep pinned count 0
    1 760 Cache table lookup prefetch client count 0
    1 760 SQL workarea memory allocated 487
    1 760 SQL workarea executions - optimal 141
    1 760 SQL workarea executions - onepass 0
    1 760 SQL workarea executions - multipass 0
    1 760 Debug LOB table id lookup cache misses 0
    1 760 SQL parse time cpu 1071
    1 760 SQL parse time elapsed 1838
    1 760 SQL parse count (total) 3296
    1 760 SQL parse count (hard) 3159
    1 760 SQL parse count (failures) 0
    1 760 SQL parse count (describe) 0
    1 760 SQL frame signature mismatch 0
    1 760 SQL execute count 3296
    1 760 User bytes sent via SQL*Net to client 675615
    1 760 User bytes received via SQL*Net from client 740333
    1 760 User SQL*Net roundtrips to/from client 4776
    1 760 User bytes sent via SQL*Net to dblink 0
    1 760 User bytes received via SQL*Net from dblink 0
    1 760 User SQL*Net roundtrips to/from dblink 0
    1 760 User bytes via SQL*Net vector to client 0
    1 760 User bytes via SQL*Net vector from client 0
    1 760 User bytes via SQL*Net vector to dblink 0
    1 760 User bytes via SQL*Net vector from dblink 0
    1 760 SQL sorts (memory) 1116
    1 760 SQL sorts (disk) 0
    1 760 SQL sorts (rows) 85895
    1 760 Debug OTC commit optimization attempts 0
    1 760 Debug OTC commit optimization hits 0
    1 760 Debug OTC commit optimization failure - setup 0
    1 760 User Workload Capture: size (in bytes) of recording 0
    1 760 User Workload Capture: dbtime 0
    1 760 User Workload Capture: user calls 0
    1 760 User Workload Capture: user calls flushed 0
    1 760 User Workload Capture: unreplayable user calls 0
    1 760 User Workload Capture: user txns 0
    1 760 User Workload Capture: user logins 0
    1 760 User Workload Capture: unsupported user calls 0
    1 760 User Workload Capture: errors 0
    1 760 User Workload Replay: dbtime 0
    1 760 User Workload Replay: network time 0
    1 760 User Workload Replay: think time 0
    1 760 User Workload Replay: time gain 0
    1 760 User Workload Replay: time loss 0
    1 760 User Workload Replay: user calls 0
    1 760 User Workload Replay: deadlocks resolved 0
    1 760 OS OS User time used 0
    1 760 OS OS System time used 0
    1 760 OS OS Maximum resident set size 0
    1 760 OS OS Integral shared text size 0
    1 760 OS OS Integral unshared data size 0
    1 760 OS OS Integral unshared stack size 0
    1 760 OS OS Page reclaims 0
    1 760 OS OS Page faults 0
    1 760 OS OS Swaps 0
    1 760 OS OS Block input operations 0
    1 760 OS OS Block output operations 0
    1 760 OS OS Socket messages sent 0
    1 760 OS OS Socket messages received 0
    1 760 OS OS Signals received 0
    1 760 OS OS Voluntary context switches 0
    1 760 OS OS Involuntary context switches 0


    longops (v$session_longops, SID 760):
    OPNAME      TARGET                               SOFAR  TOTALWORK  UNITS   ELAPSED_SECONDS  SQL_PLAN_HASH_VALUE  SQL_PLAN_OPERATION  SQL_PLAN_OPTIONS
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  41998      41998  Blocks               10           4129736862  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  40724      40724  Blocks                9           4129736862  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  79627      79627  Blocks               21           4129736862  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  77710      77710  Blocks               19           4129736862  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  61108      61108  Blocks               15           4129736862  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  61108      61108  Blocks               12           1654903796  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  61108      61108  Blocks               14           1654903796  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  15579      61108  Blocks            10344           1654903796  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  61108      61108  Blocks               13           1654903796  TABLE ACCESS        FULL
    Table Scan  ESRODS.EXTR_CMG_TRANSACTION_HISTORY  61108      61108  Blocks               13           1654903796  TABLE ACCESS        FULL


  • 18. Re: Long running update
    Randolf Geist Oracle ACE Director
    855802 wrote:
    Can anyone explain why this simple update statement against a single partition in a large table (~300,000 rows, ~1GB in size for the single partition) is taking a very long time? The most unusual thing I see in the stats is the HUGE number of buffer gets.

    Table def is below, and there are also 25 local b-tree indexes on this table (too many to paste here), each on a single column and residing in a separate tablespace from the table.

    I don't have a trace and will not be able to get one. Any theories as to the high buffer gets? A simple table scan (which occurs many times in our batch) against a single partition usually takes between 30-60 seconds. Sometimes the table scan goes haywire and I see these huge buffer gets, somewhat higher disk reads, and much longer execution times. There are fewer than 3 million rows in the partition being acted on, and I'm only updating a couple of columns; I simply cannot understand why Oracle would be getting a block (whether it was in cache already or not) over 1 BILLION times to perform this update.

    This is Oracle 11g 11.1.0.7 on RHEL 5.3, 2-node RAC, but all processing is on instance 1 and instance 2 is shut down at this point to avoid any possibility of cache fusion issues.
    As already suggested by Jonathan, you might be hitting this nasty ASSM bug with row migrations. However, your "consistent gets" statistics look much higher than what I have seen so far in my cases - but your "db block gets" are also very high, which is also something that can be seen in the statistics when hitting the bug. Of course you can get high "db block gets" without hitting the bug, so this isn't something to conclude from the statistics alone.

    From looking at the posted DDL I would have thought you're safe, since you have defined PCTFREE 25 along with the compression - in my test cases, as soon as Oracle has a little room left to manoeuvre for growing at least some of the rows within the same block without migrating, the bug usually does not surface. But then I have only tested some specific setups and yours might be quite different.

    Remember that using compression together with UPDATEs is not a good strategy in general - the modified rows will usually be stored uncompressed, requiring more space within the block than in their original state. This means that, depending on your row data, the 25 percent left free might be filled up very quickly by the uncompressed rows, potentially leading to a lot of row migrations even with that PCTFREE setting. As soon as the contiguous number of rows migrated per block reaches a certain threshold the bug seems to become relevant, and it can show up with any block size - it is not constrained to larger block sizes, it is simply more likely to show up there.
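
    A small self-contained experiment along these lines (my sketch, not the poster's setup; it assumes the CHAINED_ROWS table from utlchain.sql exists, and the migration counts you see will vary with block size and data):

    -- Load a compressed table via direct path (CTAS), leaving a NULL
    -- column for the subsequent update to grow.
    CREATE TABLE t_comp PCTFREE 25 COMPRESS AS
    SELECT ROWNUM id,
           RPAD('x', 100, 'x') pad,
           CAST(NULL AS VARCHAR2(200)) filler
    FROM   dual
    CONNECT BY LEVEL <= 100000;

    -- Updated rows are stored uncompressed and grow by ~200 bytes each.
    UPDATE t_comp SET filler = RPAD('y', 200, 'y');
    COMMIT;

    -- LIST CHAINED ROWS reports migrated rows here (each row still fits
    -- in a single block, so any hits are migrations, not true chaining).
    ANALYZE TABLE t_comp LIST CHAINED ROWS;
    SELECT COUNT(*) migrated FROM chained_rows WHERE table_name = 'T_COMP';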

    It would be interesting to know (a query sketch follows the list):

    - The block size of the tablespace where the partition resides in
    - The number of migrated rows before and after the update in the partition (ANALYZE TABLE xxx PARTITION (yyy) LIST CHAINED ROWS before and after)
    - The average number of rows per block
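
    Something like this should answer all three (a sketch; the owner, table, tablespace and partition names are taken from this thread and may need adjusting):

    -- 1. Block size of the tablespace holding the partition
    SELECT tablespace_name, block_size
    FROM   dba_tablespaces
    WHERE  tablespace_name = 'P_ESRODS_EXTR_TRANS_LARGE_DAT';

    -- 2. Migrated rows, before and after the update (requires the
    --    CHAINED_ROWS table from utlchain.sql)
    ANALYZE TABLE esrods.extr_cmg_transaction_history
      PARTITION (p_esrods_extr_cmg_tran_pre2005) LIST CHAINED ROWS;
    SELECT COUNT(*) FROM chained_rows;

    -- 3. Average rows per block (assumes current optimizer statistics)
    SELECT num_rows / NULLIF(blocks, 0) avg_rows_per_block
    FROM   dba_tab_partitions
    WHERE  table_owner    = 'ESRODS'
    AND    table_name     = 'EXTR_CMG_TRANSACTION_HISTORY'
    AND    partition_name = 'P_ESRODS_EXTR_CMG_TRAN_PRE2005';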

    If you want to find out if you're hitting the bug or not you'll need SYSDBA privileges. You'll need Tanel Poder's "tpt_public" script collection that you can get from here: http://tech.e2sn.com/oracle-scripts-and-tools

    Then you can do (at least) one of the following

    - Run "ostackprof" on the session that is currently running the UPDATE command, e.g. as SYSDBA @ostackprof <SID> 0 50

    This will run 50 stack traces of the process running session <SID>

    Here's a (slightly truncated for better readability) sample output from OSTACKPROF that I got from running this while the session hit the bug on 11.1.0.7 64bit Win7:
    Below is the stack prefix common to all samples:
    ------------------------------------------------------------------------
    Frame->function()
    ------------------------------------------------------------------------
    # 44 ->000000007717F56D
    # 43  ->OracleThreadStart()
    # 42   ->opimai()
    # 41    ->opimai_real()
    # 40     ->sou2o()
    # 39      ->opidrv()
    # 38       ->opiodr()
    # 37        ->opiino()
    # 36         ->opitsk()
    # 35          ->ttcpip()
    # 34           ->opiodr()
    # 33            ->kpoal8()
    # 32             ->opiexe()
    # 31              ->updexe()
    # 30               ->updThreePhaseExe()
    # 29                ->updaul()
    # 28                 ->qerupFetch()
    # 27                  ->updrow()
    # 26                   ->kauupd()
    # 25                    ->kdusru()
    #  ...(see call profile below)
    #
    # -#--------------------------------------------------------------------
    # - Num.Samples -> in call stack()
    # ----------------------------------------------------------------------
        20 ->kduurp()->kdumrp()->kdtInsRow()->kdtgrs()->kdtgsp()->ktspgsp_main()->ktspscan_bmb()->ktspfsrch()->ktspfpblk()->
         6 ->kduurp()->kdumrp()->kdtInsRow()->kdtgrs()->kdtgsp()->ktspgsp_main()->ktspscan_bmb()->ktspfsrch()->ktspfpblk()->
         3 ->kduurp()->kdumrp()->kdtInsRow()->kdtgrs()->kdtgsp()->ktspgsp_main()->ktspscan_bmb()->ktspfsrch()->ktspfsall()->
         3 ->kduurp()->kdumrp()->kdtInsRow()->kdtgrs()->kdtgsp()->ktspgsp_main()->ktspscan_bmb()->ktspfsrch()->ktspfpblk()->
         2 ->kduurp()->kdumrp()->kdtInsRow()->kdtgrs()->kdtgsp()->ktspgsp_main()->ktspscan_bmb()->ktspfsrch()->ktspfpblk()->
         2 ->kduurp()->kdumrp()->kdtInsRow()->kdtgrs()->kdtgsp()->ktspgsp_main()->ktspscan_bmb()->ktspfsrch()->ktspfpblk()->
    If you're hitting the bug then the most prominent functions in your stack trace profiling should be "ktspscan_bmb" and "ktspfsrch" (in the lower stack sample list) - these are used by the internal space management layer (kts) to search for a suitable block to move the migrated row to.

    - Run the "bufprof.sql" also included in the scripts collection main directory, e.g. @bufprof <SID> 1000

    This will take 1000 samples of the db block handles used by the session to access the buffers. Note that the latest version (1.03) available on Tanel's homepage caused an ORA-07445 core dump in my SYSDBA session on 11.1.0.7. The old version of the script (1.02) worked, however, so I'll paste it below in case you hit the same problem.

    This is a sample output from the script for the same session as above:
           SID   KCBBFWHY KCBBFSO_FLG TO_CHAR(K    KCBBFCM KCBBFSO_OWN      DEC KCBWHDES                       TOTAL_SAMPLES ACTIVE_PCT
    ---------- ---------- ----------- --------- ---------- ---------------- --- ------------------------------ ------------- ----------
           140          0           1    100001          0 000007FF1A655A18 CR  kdswh11: kdst_fetch                     1000       2470
           140          0           1    100001          0 000007FF1E1C9928 CR  kdswh11: kdst_fetch                     1000       2470
           140          0           1   4000000          2 000007FF1E1C9928 CUR ktspbwh1: ktspfsrch                      637    1573.39
           140          0           1         0          2 000007FF1E1C9928 CUR ktspbwh1: ktspfsrch                       70      172.9
           140          0           1   4000000          0 000007FF1E1C9928 CUR ktspbwh1: ktspfsrch                       52     128.44
           140          0           1         4        256 000007FF1E1C9928 CUR ktspfwh10: ktspscan_bmb                   29      71.63
           140          0           1         0          2 000007FF1E1C9928 CUR kdtwh00: kdtgrs                           23      56.81
           140          0           1         0          0 000007FF1E1C9928 CUR ktspbwh1: ktspfsrch                       19      46.93
           140          0           1         0          2 000007FF1E1C9928 CUR kduwh07: kdumrp                           16      39.52
           140          0           1   4000000          0 000007FF1E1C9928 CUR ktspfwh10: ktspscan_bmb                   11      27.17
           140          0           1         4        256 000007FF1E1C9928 CUR ktspbwh1: ktspfsrch                       10       24.7
           140          0           1   4000000          0 000007FF1DBA4BE0 CUR ktspbwh1: ktspfsrch                       10       24.7
           140          0           1   4000000          0 000007FF1DBA4F20 CUR ktspbwh1: ktspfsrch                        7      17.29
           140          0           1         0          2 000007FF1E1C9928 CUR kduwh01: kdusru                            5      12.35
           140          0           1         4          0 000007FF1DBA4F20 CUR ktspbwh1: ktspfsrch                        3       7.41
           140          0           1         0          0 000007FF1E1C8D90 CR  kdswh11: kdst_fetch                        3       7.41
           140          0           1         4          0 000007FF1E1C9928 CR  ktspswh12: ktspffc                         3       7.41
           140          0           1         4        256 000007FF1E1C9928 CUR ktspfwh12: ktspstchg                       3       7.41
           140          0           1         0        256 000007FF1E1C9928 CUR ktspfwh12: ktspstchg                       2       4.94
           140          0           1         0         16 000007FF1E1C9928 CUR ktsphwh39: ktspisc                         2       4.94
           140          0           1         4          0 000007FF1E1C9928 CUR ktspbwh1: ktspfsrch                        2       4.94
           140          0           1   4000000          0 000007FF1DBA4F20 CUR ktspfwh10: ktspscan_bmb                    1       2.47
           140          0           1         0        256 000007FF1E1C9928 CUR ktspfwh10: ktspscan_bmb                    1       2.47
           140          0           1   4000000          2 000007FF1E1C9928 CUR ktspfwh10: ktspscan_bmb                    1       2.47
           140          0           1         4          0 000007FF1DBA4BE0 CUR ktspbwh1: ktspfsrch                        1       2.47
           140          0           1         0          0 000007FF1E1C9928 CR  ktspswh12: ktspffc                         1       2.47
           140          0           1         4        256 000007FF1DBA4F20 CUR ktspbwh1: ktspfsrch                        1       2.47
           140          0           1         4        256 000007FF1DBA4BE0 CUR ktspbwh1: ktspfsrch                        1       2.47
           140          0           1   4000000          0 000007FF1DBA4BE0 CUR ktspfwh10: ktspscan_bmb                    1       2.47
    As you might have guessed - the same function names appear in the description as in the above stack traces.

    Here is the old "bufprof.sql" (V1.02):
    --------------------------------------------------------------------------------
    --
    -- File name:   BufProf 1.02 ( Buffer Get Profiler )
    -- Purpose:     Display buffer gets done by a session and their reason
    --
    -- Author:      Tanel Poder
    -- Copyright:   (c) http://www.tanelpoder.com
    --
    -- Usage:       @bufprof <SID> <#samples>
    --                  @bufprof 142 1000
    --
    -- Other:       This is an experimental script, which may or may not work for you
    --              It's dependent on the size of the cache buffers handles array
    --              (db_handles), so if you have lots of sessions configured
    --              scanning this repeatedly may be slow.
    --
    --------------------------------------------------------------------------------
    
    
    --DEF bufprof_cols=KCBBFSO_TYP,KCBBFSO_OWN,DECODE(KCBBFCR,1,'CR','CUR'),KCBBFWHR,KCBBFWHY,w.KCBWHDES,KCBBPBH,KCBBPBF,m.ksmmmval,p.sid,p.username,p.program
    --DEF bufprof_cols=KCBBFSO_OWN,DECODE(KCBBFCR,1,'CR','CUR'),w.KCBWHDES,KCBBPBF,m.ksmmmval,p.sid
    DEF bufprof_cols=p.sid,kcbbfwhy,kcbbfso_flg,TO_CHAR(kcbbfflg,'XXXXXXXX'),KCBBFCM,KCBBFSO_OWN,DECODE(KCBBFCR,1,'CR','CUR'),w.KCBWHDES
    
    COL kcbwhdes FOR A30
    
    COL bufprof_addrlen NEW_VALUE addrlen
    COL bufprof_addrmask NEW_VALUE addrmask
    
    SET TERMOUT OFF
    SELECT VSIZE(addr) bufprof_addrlen, LPAD('X',VSIZE(addr)*2,'X') bufprof_addrmask FROM x$kcbsw WHERE ROWNUM = 1;
    SET TERMOUT ON
    
    DEF num_samples=&2
    
    PROMPT
    PROMPT -- BufProf 1.02 (experimental) by Tanel Poder ( http://www.tanelpoder.com )
    PROMPT
    
    WITH
        s  AS (SELECT /*+ NO_MERGE MATERIALIZE */ 1 r FROM DUAL CONNECT BY LEVEL <= &num_samples),
        p  AS (SELECT p.addr paddr, s.saddr saddr, s.sid sid, p.spid spid, s.username, s.program, s.terminal, s.machine
               FROM v$process p, v$session s WHERE s.paddr = p.addr),
        t1 AS (SELECT hsecs FROM v$timer),
        samples AS (
            SELECT /*+ ORDERED USE_NL(bf) */
            &bufprof_cols,
            COUNT(*)                    total_samples
            FROM
                s, -- this trick is here to avoid an ORA-600 in kkqcbydrv:1
                (SELECT /*+ NO_MERGE */
                        b.*,
                        HEXTORAW( TRIM(TO_CHAR(TO_NUMBER(RAWTOHEX(b.kcbbfso_own),'&addrmask')+&addrlen*2,'&addrmask')) ) caddr -- call SO address
                        FROM x$kcbbf b
                        WHERE bitand(b.KCBBFSO_FLG,1) = 1
                        --AND   b.KCBBFCM > 0
                ) bf,
                p,
                x$kcbwh w
            WHERE
              1=1
            AND BITAND(bf.KCBBFSO_FLG,1) = 1  -- buffer handle in use
            AND bf.kcbbfwhr = w.indx
            AND (p.sid LIKE '&1' OR p.sid IS NULL)
            AND (p.sid != (select sid from v$mystat where rownum = 1))
            GROUP BY &bufprof_cols
        ),
        t2 AS (SELECT hsecs FROM v$timer)
    SELECT /*+ ORDERED */
        s.*
        , (t2.hsecs - t1.hsecs) * 10 * s.total_samples / &num_samples active_pct
    FROM
        t1,
        samples s,
        t2
    ORDER BY
        s.total_samples DESC
    /
    Hope this helps,
    Randolf

    Oracle related stuff blog:
    http://oracle-randolf.blogspot.com/

    Co-author of the "OakTable Expert Oracle Practices" book:
    http://www.apress.com/book/view/1430226684
    http://www.amazon.com/Expert-Oracle-Practices-Database-Administration/dp/1430226684
  • 19. Re: Long running update
    Colin Thiner Newbie
    This is great information, Randolf.

    The next test I run will be without compression (whether I am hitting the bug or not). I will also follow up with the analysis you suggested to prove out the bug theory.

    Is there a fix for the bug?

    Block size is 8k, but this weekend we did the first test with a 16k block size. It was faster in our dev environment than 8k, but we are running with only 1 thread and 1 month of data there.

    Once the batch is done this weekend, I will get the chained row count. The table is built and populated fresh for each batch, so chained rows will start at 0. I know for sure there are chained rows because I can see in session_longops that the number of blocks being scanned goes up with each table scan. I will get actual numbers once this batch is done.
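
    For reference, here is a sketch of the kind of v$session_longops query behind the output I posted earlier (SID 760 is from this thread; filter as needed):

    SELECT sid, opname, target, sofar, totalwork, units,
           elapsed_seconds, sql_plan_hash_value
    FROM   v$session_longops
    WHERE  sid = 760
    ORDER  BY start_time;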

    Avg row size is somewhere around 600 bytes, so there aren't many rows per block, but I'll get that too.

    Thanks for your help!
  • 20. Re: Long running update
    Randolf Geist Oracle ACE Director
    Currently Being Moderated
    855802 wrote:
    Is there a fix for the bug?
    As far as I know there is unfortunately no one-off patch available - it is also still reproducible in the 10.2.0.5 patch set, so it looks like a significant source code change was necessary to fix the problem in 11.2, where the issue cannot be reproduced.
    Block size is 8k, but this weekend we did the first test with a 16k block size. It was faster in our dev environment than 8k, but we are running with only 1 thread and 1 month of data there.
    Remember that so far the bug was more likely to happen with larger block sizes, so I would be surprised if moving to a larger block size actually prevented it from showing up - more likely it would simply mean that you're not hitting that bug.
    Once the batch is done this weekend, I will get the chained row count. The table is built and populated fresh for each batch, so chained rows will start at 0. I know for sure there are chained rows because I can see in session_longops that the number of blocks being scanned goes up with each table scan. I will get actual numbers once this batch is done.
    Just to get the terminology right: this issue is about migrated rows, not chained rows - I guess you mean migrated rows when you refer to chained rows. Chained rows have multiple row pieces split across several blocks because the whole row cannot fit into a single block. Migrated rows fit into a single block but had to be moved out of their original block because the row grew and there was insufficient space left. The two behave differently in different scenarios, so the distinction is important.
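
    A rough sanity check for which of the two you have (a sketch; it assumes current statistics): if the average row length is well below the block size, the rows reported by LIST CHAINED ROWS are almost certainly migrated rather than truly chained.

    SELECT p.partition_name, p.avg_row_len, t.block_size
    FROM   dba_tab_partitions p, dba_tablespaces t
    WHERE  p.tablespace_name = t.tablespace_name
    AND    p.table_owner     = 'ESRODS'
    AND    p.table_name      = 'EXTR_CMG_TRANSACTION_HISTORY';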

    I still haven't understood exactly how your process works in terms of concurrent execution and the number of iterations of the update, but it looks like you basically have multiple executions of the UPDATE, and some of the later executions potentially find no space left in the blocks due to the previous UPDATE activity - those are potentially the ones that then hit the bug.

    On the other hand, migrating such large rows out of the block and leaving only the forwarding ROWID behind usually, in my tests, leaves sufficient room in the block for other rows to grow, which prevents some of the row migrations, and therefore the bug doesn't show up. If you really are hitting the bug then I clearly haven't yet covered your scenario in my test cases.

    Hope that helps,
    Randolf
  • 21. Re: Long running update
    Jonathan Lewis Oracle ACE Director
    Randolf Geist wrote:
    From looking at the posted DDL I would have thought you're safe, since you have defined PCTFREE 25 along with the compression - in my test cases, as soon as Oracle has a little room left to manoeuvre for growing at least some of the rows within the same block without migrating, the bug usually does not surface. But then I have only tested some specific setups and yours might be quite different.
    Good point, of course.

    Given the 8KB block size, he's left about 2KB of free space per block, which (depending on the degree of compression and the accuracy of the 700 bytes that the execution plan suggests as the effective row size) means he could probably do at least two, if not three, "safe" updates to a block before the next update needed to migrate rows. Since he's updating about 1 row in 10 (at least in the update provided) and the rows/blocks figures suggest about 20 rows per block, this may mean no problems. On the other hand, if he has many processes updating this partition at the same time (without locking each other), it's possible that the cumulative effect of several processes updating the same blocks could suddenly (and apparently randomly) make one process start doing much more work.
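
    Spelling out that arithmetic (my rounding): 8,192 bytes x 25% PCTFREE leaves roughly 2,048 bytes reserved per block; at about 700 bytes per expanded row, that reserve absorbs two or three row expansions before migration has to start; and with ~20 rows per block and ~1 row in 10 updated, a single pass touches about 2 rows per block - just inside the safe margin, which is why several concurrent passes over the same blocks could tip it over.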

    Regards
    Jonathan Lewis
  • 22. Re: Long running update
    Jonathan Lewis Oracle ACE Director
    855802 wrote:

    230 million buffer gets so far...
    These look like stats for a 16KB block size - so the newer version rather than the older.
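
    That inference can be checked from the posted numbers: 188,494,594,048 bytes of physical reads divided by 11,508,558 blocks read is roughly 16,380 bytes per block, i.e. a 16KB block size.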

    Are these statistics for just the one update, or did you select from v$sesstat covering everything since the start of the session? We really need to be sure that the figures you give are about one query; otherwise we may assume that reasonable values generated by earlier queries are actually unreasonable contributions to the current query.

    A simple option would be to copy a "before" and an "after" query into an Excel spreadsheet and get it to calculate the difference.
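
    As an alternative to the spreadsheet, a sketch of doing the before/after delta in SQL (the snapshot table name is mine; SID 760 is from this thread):

    CREATE TABLE sesstat_before AS
    SELECT sn.name, st.value
    FROM   v$sesstat st, v$statname sn
    WHERE  st.statistic# = sn.statistic#
    AND    st.sid = 760;

    -- ... run the single UPDATE in session 760, then:

    SELECT sn.name, st.value - b.value delta
    FROM   v$sesstat st, v$statname sn, sesstat_before b
    WHERE  st.statistic# = sn.statistic#
    AND    st.sid   = 760
    AND    sn.name  = b.name
    AND    st.value > b.value
    ORDER  BY delta DESC;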

    When you post results, don't include the zeros, and please use the "code" tags to preserve column widths (see the note at the end).
    Sesstats:
    1 760 User user commits 1712
    So possibly the results after something like 1,712 updates?
    1 760 User session logical reads 304868069
    1 760 Cache consistent gets - examination 132352982
    1 760 Debug cleanout - number of ktugct calls 132308286
    This shows that nearly half of your buffer visits (132,352,982 "consistent gets - examination" out of 304,868,069 session logical reads, in line with the 132,308,286 ktugct calls) were visits to undo segment headers to check the commit time for blocks holding committed transactions that needed delayed block cleanout.
    1 760 Cache db block changes 4468449
    1 760 Redo redo entries 2588072
    1 760 Redo redo size 1194274208
    I think you indicated that your update would affect only one index, but the number of redo entries seems much larger than it should be for 300,000 row updates with single-index maintenance. (Again, this might be because it covers multiple statements.)

    I've just written a short note on my blog ( http://jonathanlewis.wordpress.com/2011/05/08/consistent-gets-3/ ) about the types of thing that can happen when you're doing this type of work; it's not a match for the way your results appear but, as Randolf pointed out, we really need to understand how the multiple processes fit together if we are to get a handle on the problem.


    Regards
    Jonathan Lewis

    To post code, statspack/AWR reports, execution plans or trace files, START and END the text with the tag {code} (the word "code" in lowercase, curly brackets, no spaces) so that the text appears in fixed format.

    There is a "Preview" tab at the top of the text entry panel. Use this to check what your message will look like before you post it. If it looks a complete mess you're unlikely to get a response. (Click on the "Plain text" tab if you want to edit the text to tidy it up.)
  • 23. Re: Long running update
    Colin Thiner Newbie
    Currently Being Moderated
    Thanks for all the input.

    Row migrations/Bug:
    I confirmed over the last day that we definitely have row migrations. We started with zero, and once the process completed I looked at a single partition:
    create table CHAINED_ROWS (
      owner_name         varchar2(30),
      table_name         varchar2(30),
      cluster_name       varchar2(30),
      partition_name     varchar2(30),
      subpartition_name  varchar2(30),
      head_rowid         rowid,     -- actual chained row's head piece address in the segment
      analyze_timestamp  date
    );
    
    analyze table esrods.EXTR_CMG_TRANSACTION_HISTORY partition(p_esrods_Extr_cmg_tran_201010) list chained rows;
    
    select count(*) from esrods.EXTR_CMG_TRANSACTION_HISTORY partition(p_esrods_Extr_cmg_tran_201010);
    
      COUNT(*)
    ----------
       2492429
    1 row selected.
    
    select count(*) from chained_rows;
    
      COUNT(*)
    ----------
        484157
    1 row selected.
    It seems that this in itself is a problem. The documentation is not explicit, but I conclude that the compress option overrides PCTFREE on the table; since I am using compression, we are effectively running with PCTFREE 0. 484k migrated rows out of 2,500k is pretty significant. I'm not sure I can confirm the bug at this point, but is it safe to say that eliminating the row migrations will alleviate the bug? Initial tests without compression yield longer run times for a single thread, but I still have to do a full test to get the overall time difference and monitor the behavior.
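    If eliminating the migrated rows does help, one way to clear them for a single partition (a sketch only - assuming we can rebuild the partition offline, and reusing the partition name from the ANALYZE above) would be to rebuild it without compression:

    alter table esrods.EXTR_CMG_TRANSACTION_HISTORY
      move partition p_esrods_Extr_cmg_tran_201010 nocompress  -- rewrites the rows, clearing migrations
      update indexes;                                          -- keeps the indexes usable after the move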

    2) Consistent reads
    I still think this is a possible problem. First we have to start with how the app works. It's fairly straightforward:

    The app can be run with n threads; let's use 3 for this example. Each thread of the app is designed to operate on one and only one partition of a working table. The working table is created from scratch at the beginning of the process - partitioned by month, and including as many months of data as we need to apply the rules to. Let's use 24 for this example, since that is what we are testing with in our test environment. So the table is populated: 24 partitions as defined above, each with 2-3 million rows and fewer than 125k blocks (16k block size; double that figure for 8k). For example, the Nov 2011 partition has 2,967,211 rows and 135,220 blocks (~18 rows per block). Once the table is populated, the PK and indexes are built and we gather statistics.

    There is also a table called UPDATE_SQLS_TEMP which contains the update statements. For the first part of the b-rules there are about 1,200 update statements; if we are processing 24 months of data, UPDATE_SQLS_TEMP will contain 1,200 * 24 update statements. Each update statement has the appropriate date range appended to it so that it will only ever touch its one partition.

    The app then kicks off the 3 threads. Each thread selects the 1,200 business rules for its corresponding partition/date range one at a time, runs the update statement, updates the UPDATE_SQLS_TEMP table to indicate that the update is complete, and commits. We have confirmed the commits, and verified that the update statements and threading work as designed and that all the date ranges on the updates are correct. Once thread 1 completes, it grabs another date range/partition and starts processing it, and so on until all partitions are processed.
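    In pseudo-PL/SQL, each thread behaves roughly like the sketch below. The UPDATE_SQLS_TEMP column names (SQL_TEXT, STATUS, SEQ_NO, PART_KEY) and the 'PENDING'/'DONE' values are illustrative assumptions, not our actual schema:

    declare
      l_part varchar2(30) := 'P_2010_10';   -- placeholder partition key
    begin
      for r in (select rowid as rid, sql_text
                from   update_sqls_temp
                where  part_key = l_part
                and    status   = 'PENDING'
                order  by seq_no)
      loop
        execute immediate r.sql_text;                 -- run one business rule
        update update_sqls_temp set status = 'DONE'
        where  rowid = r.rid;                         -- mark the rule complete
        commit;                                       -- one commit per rule
      end loop;
    end;
    /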

    Logically, I don't see anywhere that the updates would need to go to undo for read consistency. The plans and traces all tell me that each statement looks only at a single partition and that the updates occur serially, which is why I cannot understand the high CR.

    Jonathan, it would be interesting to see the results of your test if each of those rows were in a different partition, and therefore in different blocks. That would better replicate this scenario.
  • 24. Re: Long running update
    Jonathan Lewis Oracle ACE Director
    Currently Being Moderated
    Colin T wrote:

    The app then kicks off the 3 threads. Each thread selects the 1,200 business rules for its corresponding partition/date range one at a time, runs the update statement, updates the UPDATE_SQLS_TEMP table to indicate that the update is complete, and commits. We have confirmed the commits, and verified that the update statements and threading work as designed and that all the date ranges on the updates are correct. Once thread 1 completes, it grabs another date range/partition and starts processing it, and so on until all partitions are processed.
    If I've got it right:
    - A process identifies one partition and a list of 1,200 update statements against that partition.
    - It runs and commits each statement in turn.
    - Possibly every statement requires a full scan of the partition.

    The 25 PCTFREE does work with ASSM and compress - but after 1,200 update statements you've probably updated so many rows in each block that you've used up the 25%. How many rows get updated by each statement? Clearly, if you have that many migrated rows, it's at least that many. If you dump some blocks that have appeared past the original end of the table you will probably find that each one has literally dozens of ITL entries. The whole strategy is leading to a mess that you will have to do some detailed analysis on. You could probably model it by copying one partition into a separate table and running a modified set of updates from a simple SQL*Plus session.
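    For example (a sketch only - substitute the file and block numbers returned by the query into the DUMP command), you could pick a block holding one of the head pieces recorded by the ANALYZE ... LIST CHAINED ROWS and dump it to the trace file to inspect its ITL list:

    select dbms_rowid.rowid_relative_fno(head_rowid) as fno,
           dbms_rowid.rowid_block_number(head_rowid) as blk
    from   chained_rows
    where  rownum = 1;

    alter system dump datafile <fno> block <blk>;   -- values from the query above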

    Logically, I don't see anywhere that the updates would need to go to undo for read consistency. The plans and traces all tell me that each statement looks only at a single partition and that the updates occur serially, which is why I cannot understand the high CR.
    I've already pointed out part of the cause: it's not doing "read consistency", it's doing "get commit time" (ktugct), which requires it to check the transaction table slot in the undo segment header identified by the uncleared ITL entries left in each block by previous update passes.
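    A quick way to confirm this from the updating session (a sketch; these statistic names exist in 10g/11g, but check v$statname on your version) is to snapshot the relevant counters before and after a single update statement - the delta tells you how much of the "buffer gets" figure is commit cleanout rather than data access:

    select sn.name, st.value
    from   v$mystat   st
    join   v$statname sn on sn.statistic# = st.statistic#
    where  sn.name in ('cleanout - number of ktugct calls',
                       'consistent gets',
                       'consistent gets - examination');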
    Jonathan, it would be interesting to see the results of your test if each of those rows were in a different partition, and therefore in different blocks. That would better replicate this scenario.
    Not needed - each session is using a single partition, its migrated rows will go into the same partition; it's behaving as if it's accessing a single non-partitioned table.

    Regards
    Jonathan Lewis
  • 25. Re: Long running update
    Colin Thiner Newbie
    Currently Being Moderated
    I did leave out one important piece (which I included earlier in the thread) - there is a column called RULE_ID_2. It starts out null for all rows. Each update statement has a "where RULE_ID_2 is null" criterion and also sets RULE_ID_2 to a number, so each row will only ever be updated once. In the one-partition example above, there were about 500,000 rows that were not updated. So, to your point, there are probably still many rows updated in each block. Each individual update is pretty small, though, and I would have thought that for such small updated values we would not see a huge amount of row migration or usage of the 25% PCTFREE.

    I agree, the strategy is not optimal, but I'm stuck with it for now. Can I speed up the clearing of the ITL entries?

    For your test example, it looks like you have multiple sessions updating the same table (partition) and therefore the same block(s). That doesn't happen in my situation: I only ever have a single session updating a given partition at a time. How is that the same? I don't see how session 1 updating partition 1 would cause session 2 updating partition 2 to wait. Can you clarify?

    If we are hitting the referenced bug, it seems that not using ASSM might alleviate it. This might be the next test after removing compression.
  • 26. Re: Long running update
    Colin Thiner Newbie
    Currently Being Moderated
    Looking at the cleanouts for the last run in my test environment (24 months of data, processed over 7 hours):
    cleanout - number of ktugct calls: 5,002,321,385
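    That counter can be read at the instance level with something like this (a sketch):

    select name, value
    from   v$sysstat
    where  name = 'cleanout - number of ktugct calls';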

    Since my next test will be without compression, I ran a comparison with and without compression in my dev environment, using just a single month of data and a single thread. Here are some numbers:
    Statistic                            Compress OFF      Compress ON
    ---------------------------------    ------------      -------------
    Migrated rows                                   0      Substantial
    cleanout - number of ktugct calls         652,768         18,362,131
    consistent gets                         5,274,519         23,513,012
    consistent gets - examination             672,908         18,454,782
    redo size                             912,611,900      4,932,805,620
    Edited by: Colin T on May 11, 2011 11:21 AM
  • 27. Re: Long running update
    Randolf Geist Oracle ACE Director
    Currently Being Moderated
    Colin T wrote:
    I did leave out one important piece (which I included earlier in the thread) - there is a column called RULE_ID_2. It starts out null for all rows. Each update statement has a "where RULE_ID_2 is null" criterion and also sets RULE_ID_2 to a number, so each row will only ever be updated once. In the one-partition example above, there were about 500,000 rows that were not updated. So, to your point, there are probably still many rows updated in each block. Each individual update is pretty small, though, and I would have thought that for such small updated values we would not see a huge amount of row migration or usage of the 25% PCTFREE.
    Let's do the math with the latest information that you've provided:

    You have a 16K blocksize, but you start with an INITRANS of 100. That leaves approx. 14K of free space in the block, of which 25% (ca. 3.5K) will be left free after the initial load. This means you have a net space of approx. 10.5K for your average of 18 rows per block, which gives an average row size of 580 bytes. This is nothing new, since you previously mentioned a row size of 600 bytes - but that is the compressed size of the row, if the compression actually kicks in (Oracle has the option to leave rows uncompressed), and I was assuming you meant the uncompressed row size.

    Regarding your comment above about the "small update": maybe it slipped through, but it has previously been mentioned that when you update a row in a table with basic compression, the modified version of the row is stored in uncompressed format.

    So no matter how "small" your update is - the question you have to address is: What is the difference in size between a compressed and uncompressed row, because this difference will be taken from your 25% (3.5K) initial space left free for every row updated. In addition of course your row might grow "naturally" by the updates since the updates might make the row content larger. So you could theoretically run into the same problem even without compression.

    However, basic compression adds the twist that the row will be stored uncompressed after the update, and therefore might require a lot more space than it did in compressed format.
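    One way to put a number on that difference (a sketch only - T_UNCOMP is a throwaway table name used for illustration) is to copy one partition uncompressed and compare its AVG_ROW_LEN and BLOCKS with those of the compressed original:

    create table t_uncomp nocompress as
    select *
    from   esrods.EXTR_CMG_TRANSACTION_HISTORY
           partition (p_esrods_Extr_cmg_tran_201010);

    exec dbms_stats.gather_table_stats(user, 'T_UNCOMP')

    select avg_row_len, blocks
    from   user_tables
    where  table_name = 'T_UNCOMP';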

    Note that once the row migrations kick in they leave a lot of free space behind in the block, since the large row shrinks to a minimal stub holding the forwarding address of the migrated row.

    You said that without compression a single thread was slower than with compression. Since I would assume you get fewer row migrations without compression, this suggests that your rows take up significantly more space in uncompressed format and that the resulting segment is therefore larger. Each of the 1,200 UPDATEs that uses a full table scan now potentially takes longer than before, but the update itself should be faster since it has less work to do. That might explain why the overall runtime is longer (due to the larger segment), but I would expect you to see fewer total buffer gets and no more exceptionally long-running statements if you were really hitting the bug.
    I agree, the strategy is not optimal, but I'm stuck with it for now. Can I speed up the clearing of the ITL entries?
    How about committing less often? Then you don't need to run the block cleanout with every iteration of the 1,200 updates.
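    As a sketch of that change (the batch size of 100 is arbitrary, and the UPDATE_SQLS_TEMP column names are the same assumptions as in the loop sketch earlier in the thread):

    declare
      l_done pls_integer := 0;
    begin
      for r in (select rowid as rid, sql_text
                from   update_sqls_temp
                where  status = 'PENDING'
                order  by seq_no)
      loop
        execute immediate r.sql_text;
        update update_sqls_temp set status = 'DONE' where rowid = r.rid;
        l_done := l_done + 1;
        if mod(l_done, 100) = 0 then
          commit;                    -- commit per batch of rules, not per rule
        end if;
      end loop;
      commit;                        -- final commit for any remainder
    end;
    /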
    If we are hitting the referenced bug, seems like if we don't use ASSM that might alleviate. This might be the next test after removing compression.
    Your options to work around the bug (if you hit it) are basically (in ascending order of effort):

    - Avoid row migrations
    - Move to MSSM tablespace
    - Upgrade to 11.2

    You could also try moving to a tablespace with a smaller blocksize, hoping that the bug does not show up there, but the best strategy is to avoid the row migrations in the first place.

    Another idea that I haven't mentioned yet: if you happen to have the Advanced Compression option you could theoretically address the row-migration problem by switching to OLTP compression (in 11g, "compress for all operations"). This is supposed to re-compress the data when the block needs to be re-organized in order to coalesce its free space. However, it evidently migrates the row rather than attempting to re-compress when a row grows, and up to 11.1 it doesn't even attempt to compress the migrated rows - something that changes with 11.2. Even more interestingly, and quite counter-intuitively, using "compress for all operations" in 11.1 seems to be the surest way to run into the bug - something I plan to blog about once I've written up my notes.
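    For completeness, the syntax would be something like this (a sketch only - it requires the Advanced Compression licence, and the MOVE rebuilds the partition):

    alter table esrods.EXTR_CMG_TRANSACTION_HISTORY
      move partition p_esrods_Extr_cmg_tran_201010
      compress for all operations     -- "compress for oltp" from 11.2 onwards
      update indexes;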

    Hope this helps,
    Randolf
  • 28. Re: Long running update
    Colin Thiner Newbie
    Currently Being Moderated
    Thanks Randolf. Note that I updated the stats I sent earlier as I noticed an error. Basically, if I turn off compression I get zero row migrations and all the stats look much better (including the DBC). I'll post the results of that test sometime tomorrow. Meanwhile, I'll consider the advanced compression possibility you mentioned. Overall, I think the compression is really killing us; whether we are hitting the bug or not, we should run more efficiently without it by avoiding the row migrations.
  • 29. Re: Long running update
    Colin Thiner Newbie
    Currently Being Moderated
    Oh, I also missed that we had already tried "compress for all operations"; it reduced the row migrations, but they were still prevalent.
