Skip to Main Content

Oracle Database Discussions

Announcement

For appeals, questions and feedback about Oracle Forums, please email oracle-forums-moderators_us@oracle.com. Technical questions should be asked in the appropriate category. Thank you!

Interested in getting your voice heard by members of the Developer Marketing team at Oracle? Check out this post for AppDev or this post for AI focus group information.

Index "blank" blocks after large UPDATE that was rolled back

Vsevolod AfanassievJun 19 2015 — edited Jun 24 2015

Oracle 9.2.0.8

There is a large table TRANSACTIONS (2 billion rows, 160 GB), and has VARCHAR2(10) column TRANSACTION_ID. The column has approx 200 million distinct values.

There is a single-column index on the column.

A user executed UPDATE that set all rows to a single value: UPDATE TRANSACTIONS SET TRANSACTION_ID = :value (we don't know the value as this database has cursor sharing enabled).

The UPDATE failed (probably run out of space in UNDO) and it was rolled back. Then we noticed that for a certain value of TRANSACTION_ID the query using the index is very slow.

Even the simplified query that accesses only index took 40 min and returned only one row:

SELECT TRANSACTION_ID FROM TRANSACTION WHERE TRANSACTION_ID = '12345ABCDE';

Also the size of index increased by 20%.

After tracing the query we found that it spends time on 'db file sequential read' (as expected). Using p1, p2 from the trace file and doing block dump we found that the index contains thousands of "blank" blocks - example below.

These blank blocks don't have "rows" section. Once the index was rebuilt the issue disappeared.

Question: is it normal for "blank" blocks to appear after large transaction that was rolled back? I had a look at Richard Foote's blog but could't find any references to this behaviour.

Example of blank block:

Leaf block dump

===============

header address 4350311420=0x1034c7bfc

kdxcolev 0

KDXCOLEV Flags = - - -

kdxcolok 0

kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y

kdxconco 2

kdxcosdc 1

kdxconro 0

kdxcofbo 36=0x24

kdxcofeo 4104=0x1008

kdxcoavs 7588

kdxlespl 0

kdxlende 0

kdxlenxt 2135123197=0x7f4364fd

kdxleprv 2135123073=0x7f436481

kdxledsz 0

kdxlebksz 7624

----- end of leaf block dump -----

End dump data blocks tsn: 21 file#: 509 minblk 222399 maxblk 222399

Comments

Martin Preiss

at least the part of scanning empty index blocks that are still part of the index structure is described in https://richardfoote.wordpress.com/2008/07/01/deleted-index-entries-part-v-trouble/. Since index entries are marked as deleted and not physically deleted immediately but left to a housekeeping by following operations I could imagine that this cleanup operation took place before the transaction was rolled back completely and the index entires were recreated in a different place. But that's just guessing - and should be tested. Or Mr. Foote or someone else could provide the correct answer

Stefan Koehler

Hi Vsevolod,

unfortunately i am not Richard "Mr. Index" Foote , but i think i can demo this very clearly with a tiny test case. At first you should know that an UPDATE (for an index) is a DELETE and INSERT operation in consequence. The old index entry is marked as deleted and the new (updated) index entry is inserted in the correct place in the index structure. So just let's demo this.

SQL> create table T (a number not null);

SQL> begin

for i in 1 .. 2500 loop

  insert into T values (1);

end loop;

for i in 1 .. 500 loop

  insert into T values (3);

end loop;

commit;

end;

/

SQL> create index T_I on T(a);

SQL> exec dbms_stats.gather_table_stats(USER,'T');

SQL> select object_id from user_objects where object_name = 'T_I';

OBJECT_ID

----------

    76287


SQL> select * /*+ INDEX_FFS(t t_i)  */ from t;

Statistics

----------------------------------------------------------

0  db block gets

211  consistent gets

0  physical reads

SQL> alter session set events 'immediate trace name treedump level 76287';

----- begin tree dump

branch: 0x100010b 16777483 (0: nrow: 6, level: 1)

   leaf: 0x100010c 16777484 (-1: nrow: 512 rrow: 512)

   leaf: 0x100010d 16777485 (0: nrow: 512 rrow: 512)

   leaf: 0x100010e 16777486 (1: nrow: 512 rrow: 512)

   leaf: 0x100010f 16777487 (2: nrow: 512 rrow: 512)

   leaf: 0x1000110 16777488 (3: nrow: 512 rrow: 512)

   leaf: 0x1000111 16777489 (4: nrow: 440 rrow: 440)

----- end tree dump

SQL> update T set a=2 where a=1;

SQL> alter session set events 'immediate trace name treedump level 76287';

----- begin tree dump

branch: 0x100010b 16777483 (0: nrow: 15, level: 1)

   leaf: 0x100010c 16777484 (-1: nrow: 512 rrow: 0)

   leaf: 0x100010d 16777485 (0: nrow: 512 rrow: 0)

   leaf: 0x100010e 16777486 (1: nrow: 512 rrow: 0)

   leaf: 0x100010f 16777487 (2: nrow: 512 rrow: 0)

   leaf: 0x1000110 16777488 (3: nrow: 291 rrow: 0)

   leaf: 0x1000112 16777490 (4: nrow: 291 rrow: 130)

   leaf: 0x1000113 16777491 (5: nrow: 291 rrow: 291)

   leaf: 0x1000114 16777492 (6: nrow: 291 rrow: 291)

   leaf: 0x1000115 16777493 (7: nrow: 291 rrow: 291)

   leaf: 0x1000116 16777494 (8: nrow: 291 rrow: 291)

   leaf: 0x1000117 16777495 (9: nrow: 291 rrow: 291)

   leaf: 0x100011d 16777501 (10: nrow: 291 rrow: 291)

   leaf: 0x100011e 16777502 (11: nrow: 291 rrow: 291)

   leaf: 0x100011f 16777503 (12: nrow: 393 rrow: 393)

   leaf: 0x1000111 16777489 (13: nrow: 440 rrow: 440)

----- end tree dump

SQL> rollback;

SQL> alter session set events 'immediate trace name treedump level 76287';

----- begin tree dump

branch: 0x100010b 16777483 (0: nrow: 15, level: 1)

   leaf: 0x100010c 16777484 (-1: nrow: 512 rrow: 512)

   leaf: 0x100010d 16777485 (0: nrow: 512 rrow: 512)

   leaf: 0x100010e 16777486 (1: nrow: 512 rrow: 512)

   leaf: 0x100010f 16777487 (2: nrow: 512 rrow: 512)

   leaf: 0x1000110 16777488 (3: nrow: 291 rrow: 291)

   leaf: 0x1000112 16777490 (4: nrow: 161 rrow: 161)

   leaf: 0x1000113 16777491 (5: nrow: 0 rrow: 0)

   leaf: 0x1000114 16777492 (6: nrow: 0 rrow: 0)

   leaf: 0x1000115 16777493 (7: nrow: 0 rrow: 0)

   leaf: 0x1000116 16777494 (8: nrow: 0 rrow: 0)

   leaf: 0x1000117 16777495 (9: nrow: 0 rrow: 0)

   leaf: 0x100011d 16777501 (10: nrow: 0 rrow: 0)

   leaf: 0x100011e 16777502 (11: nrow: 0 rrow: 0)

   leaf: 0x100011f 16777503 (12: nrow: 60 rrow: 60)

   leaf: 0x1000111 16777489 (13: nrow: 440 rrow: 440)

----- end tree dump

SQL> select dbms_utility.data_block_address_file(16777501) file#, dbms_utility.data_block_address_block(16777501) block# from dual;

     FILE#     BLOCK#

---------- ----------

  4  285

SQL> alter system dump datafile 4 block 285;

Block header dump:  0x0100011d

Object id on Block? Y

seg/obj: 0x129ff  csc: 0x00.17a27a  itc: 2  flg: E  typ: 2 - INDEX

     brn: 0  bdba: 0x1000118 ver: 0x01 opc: 0

     inc: 0  exflg: 0

Itl           Xid                  Uba         Flag  Lck        Scn/Fsc

0x01   0x0001.019.0000046d  0x00c12a77.00b5.01  -BU-    1  fsc 0x0000.0017a27b

0x02   0x0009.016.000005f7  0x00c004a1.0224.03  ----    0  fsc 0x0000.00000000

Leaf block dump

===============

header address 140525473464932=0x7fcea2ed8264

kdxcolev 0

KDXCOLEV Flags = - - -

kdxcolok 1

kdxcoopc 0x80: opcode=0: iot flags=--- is converted=Y

kdxconco 2

kdxcosdc 2

kdxconro 0

kdxcofbo 36=0x24

kdxcofeo 4540=0x11bc

kdxcoavs 7996

kdxlespl 0

kdxlende 0

kdxlenxt 16777502=0x100011e

kdxleprv 16777495=0x1000117

kdxledsz 0

kdxlebksz 8032

*** dummy key ***

row#0[6748] flag: ----S-, lock: 2, len=12

col 0; len 2; (2):  c1 03

col 1; len 6; (6):  01 00 01 06 01 c1

----- end of leaf block dump -----

SQL> select * /*+ INDEX_FFS(t t_i)  */ from t;

Statistics

----------------------------------------------------------

0  recursive calls

0  db block gets

224  consistent gets

0  physical reads

So you can see the exact same behavior as in your system. The index is expanded by the UPDATE from 6 to 15 leaf blocks. The old leaf blocks (for value 1) still contain 512 index entries (nrow: 512), but all of them are marked as deleted / deleted (rrow: 0). The new index leaf blocks contain your update value (2) now. After the rollback these "new" leaf blocks still remain in the index structure and may be scanned (depends on your access path), but are empty again (nrow: 0 rrow: 0). No marked entries for deletion, just empty! You get the same block content as in your case, if you dump such a block.

Afterwards if you use that index to scan your data (once again the consequence depends on your access path) the I/O increases from 211 to 224 (with a standard SQL*Plus array size of 15). I hope this clearly demonstrates the impact and behavior in the index structure in case of an UPDATE.

> Question: is it normal for "blank" blocks to appear after large transaction that was rolled back?

.. or to make it very short: Yes, it is.

Regards

Stefan

Jonathan Lewis

I got some good comments on this topic on my blog some time ago in a little quiz item about a particular difference between tables and indexes: https://jonathanlewis.wordpress.com/2010/03/25/index-too-big/

Regards

Jonathan Lewis

User_UQDX2
John Stegeman

Why would some anonymous, unattributed content be more helpful than the content that has already been posted (and attributed) and followed-up with comments against that content?

Richard Foote-Oracle

Hi Vsevolod

Yes, as others have said, when you update a non-unique indexed column, the existing index entries are effectively deleted and the new index entries are added for the new column values. If the transactions rolls back, the new index entries are then removed but any new leaf blocks remain in the index structure. This can indeed result in performance penalties if these now empty leaf blocks need to be accessed which could well be the case for the select on the equality predicate.

What I find interesting in your example though is that someone attempted to update a column that previously had 200 million distinct values with the same value. And yet still want an index on this column that would now only have the same value for all 2 billion rows in the table, an index with now questionable merit. Perhaps it was a very good thing that the update ultimately failed .

Cheers

Richard Foote

https://richardfoote.wordpress.com/

Vsevolod Afanassiev

Thanks to all for responding. What concerns me most isn't the presence of empty blocks, but the impact on query performance. Here we have a simple query with equality condition (TRANSACTION_ID = value) and normally it takes milliseconds. After an UPDATE (or DELETE and INSERT) the query is taking very long time as instead of reading one leaf block it has to read what appears to be a linked list of empty leaf blocks before if finally gets to the block containing the value. The situation with 2 billion rows being updated in a single transaction isn't very common, but smaller multi-row updates do occur. And they tend to get bigger as developers know to avoid frequent commits and the size of UNDO tablespace allows transaction of any size.

I did a simple calculation:

- let's say a leaf block contains 300 values

- if a transaction updates 300,000 values then we may get a linked list of 1,000 empty blocks

- assuming that all reads are physical and 5 milliseconds per read it will take 5 seconds to read this list

5 seconds doesn't sound too much, but there are applications where timeout is set to a small value (I've seen 10 seconds) and a query running for 5 sec may cause timeout.

And this timeout will be repetitive, i.e. if the query is re-run for the same value it will again take long time (ignoring caching).

I am not keen on rebuilding indexes all the time, but it seems like the only option.

unknown-7404

What concerns me most isn't the presence of empty blocks, but the impact on query performance.

Huh? Those are ONE AND THE SAME - it is the empty blocks that cause the performance issue.

I am not keen on rebuilding indexes all the time, but it seems like the only option.

Isn't the solution to NOT update the column to the same value for all of those rows?

Richard Foote-Oracle

Hi Vsevolod

The impact of the empty leaf blocks will depend entirely on what value you set the transaction_id to during the update and what value you then subsequently select.

If you update the column to say the minimum value of the column (say 'A'), update millions of rows and then rollback (or transaction fails and it automatically rolls back), you have an index structure now that has branches pointing to the left most leaf blocks as if containing the values 'A'. If you then try and select transaction_id='A', then Oracle will indeed have to navigate through all the now empty leaf blocks searching in vain for the value which may not exist (or only exists as they did previous to the update). This will of course be slow and require fixing.

If however you search for an entirely different value in which the index branches can directly point to the correct starting position within the leaf blocks, then performance won't be impacted at all by the empty leaf blocks as they're not accessed.

So it all depends.

Certainly updating all rows (or a large proportion of rows) to the same value will have a considerable impact if you then want to effectively access the corresponding rows via an index (whether you commit or rollback the transaction). The empty blocks will eventually get recycled but perhaps not in time to save the performance of queries that search for the updated values.

Cheers

Richard Foote

https://richardfoote.wordpress.com/

Jonathan Lewis

Being aware of potential problems is important, but considering how they might appear in your own system is equally so.

The problem of having to walk 1,000 empty index leaf blocks because of data deletion means that at some stage you've had a query - either something of the form "colx = {constant}" or "colx between {k1} and {k2}" which also had to walk that 1,000 blocks when they were full. If that's supposed to be a possible threat (rather than the side effect of an accident) then you need to think carefully about what you're doing, how often you're doing it, and whether it's something that needs a proper design review.

Think of it as being similar to the table high water mark problem.  If you insert 1M rows into a table and then delete them a tablescan of the table is still going to walk through the space for 1M rows. It used to be a fairly standard problem in the past that some programs would regularly insert and delete a few rows (tens) into a scratch table every few minures during the day, and a batch process would insert and delete 1,000,000 row once per night.  Leaving the daytime users scanning a very large empty table unless the last step of the batch was (e.g.) able to truncate the table. That problem disappeared (or should have) with the appearance of global temporary tables - but it still gives a hint about how extreme cases of index maintenance might need special thinking.

Regards

Jonathan Lewis

Martin Preiss

Vsevolod Afanassiev wrote:

I had a look at Richard Foote's blog but could't find any references to this behaviour.

obviously just a small temporal anomaly: https://richardfoote.wordpress.com/2015/06/23/empty-leaf-blocks-after-rollback-part-i-empty-spaces/ and https://richardfoote.wordpress.com/2015/06/24/empty-leaf-blocks-after-rollback-part-ii-editions-of-you/.

1 - 12
Locked Post
New comments cannot be posted to this locked post.

Post Details

Locked on Jul 21 2015
Added on Jun 19 2015
12 comments
8,959 views