10.2.0.5.6
I have a ticket open with support related to the ORA-600 and ORA-7445 errors. I came here for help on how to identify corrupt data. Support can be really slow, and I find if I ask more than 1 question, it increases response time exponentially. I did not post the ORA-600 errors since I went to support on that and I wanted to keep this focused on how to find the bad data.
I do not have block corruption. I already checked for that. I will also explain why I think I have corrupt data. It is intermittent, but it is causing a very old job to fail. We do not have resources to change the code. It is also very complex. We have found a few corrupt records in 1 table, but we just blew up on a 40 gb table. Export/Import won't work. This is a 10 TB database. I basically need to find the bad data and get rid of it.
These are the things I have done so far:
STEP 1:
I am able to narrow it down to 1 field in a record. This field in this one table has a length of -1.
We running into an ora-600 error when we hit certain records and apply replace(to_char(field)) and we lose connection to the DB. I have a ticket open with oracle on this.
STEP 2:
I do not think the blocks are corrupt becasue I used analyze table validate structure and it came up fine. I do not have dbms_repair installed and I would need to go through an approval process to get it installed (even in a test DB).
Note: Field names have been changed to dummy values.
STEP 3:
The field in question is defined as number (15,3)
When I query the bad data in sqlplus it comes back as .0000. If you insert .0000 into a table, it rounds to 0
What makes it stranger if I query the same record in SQL Developer I get 6.5435 back. So a different value.
This tells me the data is corrupt. I can't figure out an easy way to find it.
STEP 4:
1 select rawtohex(badvar) myrawtohex ,dump(badvar) mydump,length(badvar) mylength
2 FROM mytab
3 where
4* rowid = '<myrowid>'
A325KJ@orpt3> /
MYRAWTOHEX MYDUMP MYLENGTH
-------------------- ---------------------------------------- ----------
C00000 Typ=2 Len=3: 192,0,0 -1
STEP 5:
create table badtable as
select *
from mytab
where length(badvar) = -1
returns 19 records. The data is still corrupted. Now a simple TO_CHAR creates a core dump and not a replace(to_char(badvar))
Step 6:
As stated above we ran into the same error in another table that is 40 gbs (and not partitioned). We have many more tables we may run into this. I don't know if all of bad data will have a length = -1.
Is there a way to scan the whole database for bad data? DB_VERIFY won't work because it looks for corrupt blocks. The blocks can't be corrupt, analyze table validate structure worked and I can query the data without a function.
We have 12-15 year old code that is failing on this. So I can't get the code changed to go around it (not sure of an easy way to do it).
I really just need to find the bad data and delete it.
Edited by: Guess2 on May 4, 2012 11:15 AM