2 Replies Latest reply: Sep 18, 2011 1:50 AM by 885300

Rep manager reports DB_NOTFOUND during log file archival

885300 Newbie
I seem to be hitting the same problem described in the thread "Understanding the archiving and its relation to PANIC Error".

Each day, before running db_hotbackup, we archive our logs using this command:
/usr/local/BerkeleyDB.5.2/bin/db_archive | xargs -tI '{}' mv '{}' ../db_log_archive/'{}'

But last night while that command was executing it caused a database panic. Our log files showed these messages:

BDB: DB_ENV->rep_process_message: BDB0073 DB_NOTFOUND: No matching key/data pair found
BDB: message thread failed: BDB0073 DB_NOTFOUND: No matching key/data pair found
BDB: BDB0061 PANIC: BDB0073 DB_NOTFOUND: No matching key/data pair found
BDB: BDB0060 PANIC: fatal region error detected; run recovery

We are using Berkeley DB version 5.2.28

Is it possible that db_archive is returning logs that are needed during replication?

Edited by: 882297 on 16/09/2011 13:36
  • 1. Re: Rep manager reports DB_NOTFOUND during log file archival
    524722 Explorer
    Do you know if anything unusual was going on with your system at the time
    of the panic? Can you contact me offline at firstname.lastname@oracle.com
    and send me the __db.rep.diag00 and __db.rep.diag01 files?

    Those files should exist whether or not you have verbose output turned on.
    However, if you could reproduce the problem with verbose output, that would
    be the best first step.

    My system only has -t for 'xargs'. What does '-tl' do? On my system, -t
    echoes the command. If that is what yours is doing, can you send that output
    so that we know which log files it moved?

    Sue LoVerso
    Oracle
  • 2. Re: Rep manager reports DB_NOTFOUND during log file archival
    885300 Newbie
    I have sent you the __db.rep.diag files and the output of the archive command I gave earlier. The -I option to xargs just explicitly sets {} as the token that is replaced with the name of each file in the command line you specify.

    The only unusual thing that was happening to the system at the time was running the archive command. As far as I know, everything else was running smoothly. We have had our backup procedure set up this way and running in production for at least a few months now, so it may be rare.

    Unfortunately as this is a production system I am not amazingly keen to try and reproduce the problem. We actually had a bug in our DB recovery code so this ended up causing some downtime.

    We might be able to simulate some load on a test machine and then run the backup script again and again, but that would take a bit of time to set up. Hopefully you are able to see something without needing to do that.
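
The `-I` substitution described in reply 2 can be previewed without moving anything. The following is a minimal sketch, not the poster's actual script: it assumes an `xargs` that replaces `{}` inside a longer argument (GNU xargs does), the two log file names are made up for illustration, and `echo` stands in for `mv` so no files are touched.

```shell
# Hypothetical log names; in the real pipeline these would come from db_archive.
# With -I '{}', xargs runs the command once per input line and replaces
# every occurrence of {} with that line.
preview=$(printf '%s\n' log.0000000001 log.0000000002 |
    xargs -I '{}' echo mv '{}' ../db_log_archive/'{}')

# Show the mv commands that would have been run.
printf '%s\n' "$preview"
```

Reviewing this preview before swapping `echo` back out is one way to see exactly which log files a given run would move.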
