My scenario is:
1. I'm using Berkeley DB (BDB) in a multi-threaded environment.
2. My process crashed for some reason. The environment and database were not closed properly before the crash.
3. I restarted the process, and the Db::open call hung.
Here is the stack trace:
Thread 66 (LWP 16050):
#0 0x00007ffb80857d84 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib/x86_64-linux-gnu/libpthread.so.0
#1 0x00007ffb8108134b in __db_pthread_mutex_lock () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#2 0x00007ffb8110f262 in __lock_get_internal () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#3 0x00007ffb8110f7bb in __lock_get () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#4 0x00007ffb811376bb in __db_lget () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#5 0x00007ffb81090145 in __bam_read_root () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#6 0x00007ffb8113b212 in __db_open () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#7 0x00007ffb81134ab8 in __db_open_pp () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#8 0x00007ffb81076a69 in Db::open(DbTxn*, char const*, char const*, DBTYPE, unsigned int, int) () from /usr/lib/x86_64-linux-gnu/libdb_cxx-5.1.so
#9 0x0000000000410aac in BdbDB::dbOpen (this=0x2174f60) at db.cpp:89
My pseudo source code is below:
UINT32 db_flags, env_flags;
m_env = new DbEnv(0);
env_flags = DB_CREATE |     /* Create the environment */
            DB_INIT_LOCK |  /* Initialize locking. */
            DB_THREAD |     /* Enable threading */
            DB_INIT_MPOOL;  /* Initialize the in-memory cache. */
m_env->open(m_dbPath.c_str(), env_flags, 0);
db_flags = DB_CREATE;
m_db = new Db(m_env, 0);
Kindly help me find a solution for this hang.
~ Ashish K.
There are two problems with your code. First, if you enable locking with the DB_INIT_LOCK flag, you also need to configure the environment either for transactions (Chapter 11. Berkeley DB Transactional Data Store Applications) or for the concurrent data store (Chapter 10. Berkeley DB Concurrent Data Store Applications). Second, after a crash you should open the environment with the DB_RECOVER flag, which runs recovery and cleans up any locks left over from the crash. Note that it is safe to use DB_RECOVER every time you open the environment.
Enabling transactions and adding the flags below solved the issue for me:
DB_REGISTER | DB_RECOVER
This combination of flags let me open the database successfully after an application crash.
I have another question: I need the same recovery for a database that doesn't require transactions. Is it possible to avoid lock conflicts after an application crash without enabling transactions?
The docs clearly state the following:
Run normal recovery on this environment before opening it for normal use. If this flag is set, the DB_CREATE and DB_INIT_TXN flags must also be set, because the regions will be removed and re-created, and transactions are required for application recovery.
In that case, how can I enable recovery after an application crash for non-transactional databases?
~ Ashish K.
Recovery is only possible when using transactions. In a non-transactional database, you can delete the environment region files (files of the form __db.###, for example __db.001) after a crash to clear out leftover locks, but there is no way to recover any lost data or to make sure the database is in a consistent state.