Forum Stats

  • 3,728,709 Users
  • 2,245,678 Discussions
  • 7,853,708 Comments

Discussions

Stale data visible on oracle index

Nikhil Juneja
Nikhil Juneja Member Posts: 196 Bronze Badge
edited August 2019 in SQL & PL/SQL

Hi All,

Oracle version

Oracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit Production

PL/SQL Release 11.2.0.4.0 - Production

CORE 11.2.0.4.0 Production

TNS for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - Production

NLSRTL Version 11.2.0.4.0 - Production

We have a strange situation in our Oracle production DB, where we are getting data from index, but table doesn't have any records for those values.

I know it sounds a bit weird, but when i query only indexed columns i get results but not when i query the entire row.

Is there any known bug with Oracle? And what should be our best solution, rebuild or drop and recreate index?

We also want to find the root cause what caused it.

Regards,

Nik

Tagged:
Nikhil Junejamathguy

Answers

  • John Thorton
    John Thorton Member Posts: 14,493 Silver Crown
    edited August 2019
    Nikhil Juneja wrote:Hi All,Oracle versionOracle Database 11g Enterprise Edition Release 11.2.0.4.0 - 64bit ProductionPL/SQL Release 11.2.0.4.0 - ProductionCORE 11.2.0.4.0 ProductionTNS for IBM/AIX RISC System/6000: Version 11.2.0.4.0 - ProductionNLSRTL Version 11.2.0.4.0 - ProductionWe have a strange situation in our Oracle production DB, where we are getting data from index, but table doesn't have any records for those values.I know it sounds a bit weird, but when i query only indexed columns i get results but not when i query the entire row.Is there any known bug with Oracle? And what should be our best solution, rebuild or drop and recreate index?We also want to find the root cause what caused it.Regards,Nik

    how can we reproduce what you report?

    Nikhil Juneja
  • Nikhil Juneja
    Nikhil Juneja Member Posts: 196 Bronze Badge
    edited August 2019

    Thanks John for prompt response.

    That's one of the issue we are facing, not reproducible.

    i will give an e.g. of what is happening:

    select  sd.instrument_id,sd.account_id,sd.subaccount_cd,sd.end_dt from sd_pos_history sd

    where sd.account_id = 16892805

    and sd.instrument_id <> round(sd.instrument_id)

    ;

    --this returns records as we have a primary key index on all these 4 columns,

    but when we run a query with select for.e.g

    select  sd.start_dt from sd_pos_history sd

    where sd.account_id = 16892805

    and sd.instrument_id <> round(sd.instrument_id)

    it is not giving any record.

    Regards,

    Nik

  • John Thorton
    John Thorton Member Posts: 14,493 Silver Crown
    edited August 2019
    Nikhil Juneja wrote:Thanks John for prompt response.That's one of the issue we are facing, not reproducible.i will give an e.g. of what is happening:select sd.instrument_id,sd.account_id,sd.subaccount_cd,sd.end_dt from sd_pos_history sdwhere sd.account_id = 16892805and sd.instrument_id <> round(sd.instrument_id);--this returns records as we have a primary key index on all these 4 columns, but when we run a query with select for.e.g select sd.start_dt from sd_pos_history sdwhere sd.account_id = 16892805and sd.instrument_id <> round(sd.instrument_id)it is not giving any record.Regards,Nik

    Interesting observations.

    Folks here can only make random guesses & speculation without having any reproducible test case.

    If you don't mind obliterating what evidence now exists, you could blindly just rebuild the index & see what results. Of course if you do, you likely will never learn or understand how things got into this situation.

    What is your primary goal? To make things work as expected & desired? To learn why the current situation exists so as to avoid it in the future?

    Nikhil Junejamathguy
  • mathguy
    mathguy Member Posts: 9,731 Gold Crown
    edited August 2019

    I assume you thought about the obvious ways that can happen, but just in case...

    Are you running both queries from the same session? If not, of course, one session may have uncommitted changes, which are visible to that session but not to the other one.

    Even if you are running both queries from the same session - are you sure there is no DML activity on the table, between the runs? That means both DML statements in the same session (committed or not) AND committed DML transactions from other sessions. The latter are more likely to be overlooked.

    Are you able to run the two queries (one that reads from the index, one from the table) repeatedly, with the same values in the WHERE clause, and consistently get the same behavior? Rows are found when reading from the index, but no rows found when reading from the full table?

  • Nikhil Juneja
    Nikhil Juneja Member Posts: 196 Bronze Badge
    edited August 2019

    Agree to all points here.

    Since it is one of the core table and the issue is not widespread so we decided not to rebuild as our management was scared to cause index being unusable as that would impact GUI too.

    So as per current status we are not rebuilding the index.

    Since it is a live production environment (not used extensively on weekends) , so we are trying to see why the current situation is coming and how it came in first instance. Also how can be avoid it in future.

    Regards,

    Nik

  • Nikhil Juneja
    Nikhil Juneja Member Posts: 196 Bronze Badge
    edited August 2019

    yes we checked all these things.

    Session is same, even different sessions give the same result.

    This table is mostly updated overnight by batch jobs.

    I ran the same queries multiple times and got same result.

  • Paulzip
    Paulzip Member Posts: 8,259 Blue Diamond
    edited August 2019

    It's difficult to know if you have a corrupt index(es) or corrupt blocks or similar.  I've seen the your symptoms in the past, but we didn't have time to fully investigate due to customer pressure, so were forced to rebuild the indexes out of necessity.  This cured the issue.

    You might benefit from running DBMS_REPAIR, it may give you information on what the issue is.

    Nikhil JunejaNikhil Juneja
  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,587 Gold Crown
    edited August 2019

    Index-only query

    select sd.instrument_id,sd.account_id,sd.subaccount_cd,sd.end_dt
    from sd_pos_history sdwhere sd.account_id = 16892805and sd.instrument_id <> round(sd.instrument_id);
    Query not via index - no rows returned.
    select sd.start_dt from sd_pos_history sdwhere sd.account_id = 16892805and sd.instrument_id <> round(sd.instrument_id)

    Does the second query execute a tablescan or use another index ?  If the former than that suggests you have a corrupted primary key index.

    Do a "select /*+ full(t) */ count(*) from table t" and "select /*+ index_ffs(t, pk) */ count(*) from table t" to see if the report the same number of rows.

    For the primary key acccess - select also the rowid, instrument_id, round(instrument_id) - set the numformat display to show 38 places after the decimal point.

    For the rowid reported, select from the table for that rowid to see what happens.

    There is a VALIDATE command that you can use to check the structure of the index to make sure it's self-consistent. From dim and distant memory I think that ic you also use the CASCADE option it will check if the index is consistent with the table.  THIS WILL LOCK THE TABLE.while it runs.

    Regards

    Jonathan Lewis

    Nikhil Juneja
  • Nikhil Juneja
    Nikhil Juneja Member Posts: 196 Bronze Badge
    edited August 2019

    Second query uses another non -unique index as start_dt is not a part of primary key.

    I did run the rowid queries:

    from index the values are a some weird values like 0.000000000000000000000000000000000000000.......4 for one of the columns

    but if i run the query from table the values are showing fine.

    To add:

    we are sometimes getting the following error too while running a select:

                                 *

    ERROR at line 1:

    ORA-00600: internal error code, arguments: [kdsgrp1], [], [], [], [], [], [],

    [], [], [], [], []

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,587 Gold Crown
    edited August 2019

    It certainly sounds as if you have some sort of index corruption - but it may be more serious than that.

    Please clarify the following:

    Did you check whether the table and the primary index had the same number of rows/entries ?

    Did you mean by your comments about "from index" / "from table" that when you selected the data for account_id 16892805 using an index-only path against the primary key you found some wierd values, but when you used the reported rowids to visit the table (a) you acquired a row every time, and (b) those rows showed "proper" values.  If that's the case then, for the index entries with the wierd values, did the rounded index values match the table values or were they complete different from the table values, or (long-shot) did they look like they could be the table values divided by a huge power of 10 ?

    What is the data type of the instrument_id -- as shown by a DESCRIBE.  (If it's NUMBER tell me the precision and scale reported - and if the precision and/or scale are not set make sure you state that explicitly).

    One of the causes of ORA-00600 [kdsgrp1] is simply "corrupt index" - so maybe you've got other index entries that don't even point to legal rows in the table. (grp is short for "get row piece").  So you do need to validate the structure.  The trace file dump from the ORA-00600 should report the block that triggered the error, and it may even do a symbolic dump of the block; a block dump will tell you which object the block belongs to and may give you some clues about the corruption - it's probably something that Oracle will ask for if you raise an SR.

    I may have been wrong about locking when you validate structures. There are "online" options:

    analyze index t1_i1 validate structure online;

    analyze table t1 validate structure cascade online;

    The problem with validating the index structure is that it may be internally consistent but not agree with the table, which is why you may have to move on to the table option.

    Regards

    Jonathan Lewis

  • Nikhil Juneja
    Nikhil Juneja Member Posts: 196 Bronze Badge
    edited August 2019

    Sorry forgot to mention that.

    yes the count was same with index and full.

    Did you mean by your comments about "from index" / "from table" that when you selected the data for account_id 16892805 using an index-only path against the primary key you found some wierd values, but when you used the reported rowids to visit the table (a) you acquired a row every time, and (b) those rows showed "proper" values.  If that's the case then, for the index entries with the wierd values, did the rounded index values match the table values or were they complete different from the table values, or (long-shot) did they look like they could be the table values divided by a huge power of 10 ?

    Yes they showed weird values with index. Rounded values don't match as they are almost 0 (value is 0.0000000000000004(130 times 0)

    Column value for that instrument is and should be 4 (say for one example)

    Datatype of column is number (without precision)

    As of now we are holding on to validate the structure as we want to find the root cause. I will raise an Oracle SR in the meantime too.

    I have asked our DBA to run DBMS_REPAIR to check the corrupt blocks.

    Regards,

    Nik

  • Nikhil Juneja
    Nikhil Juneja Member Posts: 196 Bronze Badge
    edited August 2019

    Hi All,

    Thanks for your inputs. We did a rebuild of all the indexes and it fixed the issue.

    Just for information: do we need to regularly rebuild the indexes say every 6 months or so. I know this is a never ending discussion if i search for forums on google.

    The reason I am asking is, our management wants to know the root cause of the issue and preventative actions for the same.

    Also shall we reach out to Oracle using SR for getting the root cause identified?

    Regards,

    Nik

  • Jonathan Lewis
    Jonathan Lewis Member Posts: 9,587 Gold Crown
    edited August 2019
    Nikhil Juneja wrote:
    Thanks for your inputs. We did a rebuild of all the indexes and it fixed the issue.
    Just for information: do we need to regularly rebuild the indexes say every 6 months or so. I know this is a never ending discussion if i search for forums on google.The reason I am asking is, our management wants to know the root cause of the issue and preventative actions for the same.Also shall we reach out to Oracle using SR for getting the root cause identified?

    There's no point in doing a regular rebuild to pre-empt this issue - it looks like a bug and since we don't know why it happened we can't say that it won't happen 30 seconds after your  next rebuild.  In the short term you just have to live with the fact that maybe the bug will reappear at random.

    DId you try any validate options to see if Oracle could detect the defect ?

    Unless you took a backup of the database and can run it somewhere else there's probably on point in raising an SR after you've destroyed the evidence.  You might get lucky, you might have enough details for an internal Oracle search to find a matching bug - but don't hold your breath waiting for an answer.

    Regards

    Jonathan Lewis

    P.S,  The example of the mismatch you gave looked like the index had the correct "precision" (i.e. value) at the wrong "scale" - viz: 130 decimal places to the left. This could be caused by a single bad byte getting into the data.  If the same pattern appeared from the other bad examples then possibly that's enough of a description of the problem for it to be recognised as a known (unpublished) bug.

    Nikhil JunejaNikhil Juneja
  • Nikhil Juneja
    Nikhil Juneja Member Posts: 196 Bronze Badge
    edited August 2019

    Thanks again Jonathan, lesson learnt. Next time if we see any such weird behavior we will involve DBA's early and push them to raise SR for finding the root cause.

    As you mentioned, we have already destroyed the evidence.

    Regards,

    Nik

Sign In or Register to comment.