Over 4GB read-only mmap B-tree fails to find records

User_DAIKJ

Systems: Windows 10, RHEL, HP-UX; always a 64-bit build.

BDB: 5.x, 6.x, 18.x

This problem exists on every system I have tried, with every version of BDB I can get my hands on.

  1. Create a B-tree database just over 4GB in size. I use a 64-bit unsigned key and a 24-byte record; it takes about 67,200,300 records to push the file over 4GB (4,341,264,384 bytes). The same key/record setup with 66,200,300 records stays just under 4GB and does NOT exhibit the problem.
  2. After generation, reopen the database read-only (DB_RDONLY) inside an environment whose cache size and mmap limit are large enough for BDB to mmap the file. I use a 5GB cache and set the mmap limit to 16GB.
  3. Start looking up the records. Very quickly db->get fails with BDB0073 DB_NOTFOUND. The library also produces a BDB3008 message: "dirty flag set for readonly file page". If you keep calling db->get, the library panics.
  4. Reopen the database, but this time pass DB_NOMMAP to the db->open call. All the keys are now found.
  5. Generate a database with 66,200,300 records of the same type as above. The file is just under 4GB (4,247,216,128 bytes).
  6. Reopen the under-4GB database the same way as in step #2, i.e. read-only, with the cache and mmap limit set to allow BDB to mmap the file. All the keys are found.
  7. Run db_stat on both database files; it reports nothing wrong with either.
  8. Use db_dump to view the first 100 or so records; the key that could not be found above is indeed in the database file.
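
Steps 2 to 4 can be sketched roughly as follows. This is only a minimal illustration built from the standard Berkeley DB C API, not the actual test program: the environment directory, the file name, the number of keys probed, and the assumption that keys are consecutive native-endian 64-bit integers starting at 0 are all hypothetical placeholders. It assumes a 64-bit build linked against libdb (for example with -ldb) and an existing ./env directory containing the database file.

```c
/* Sketch: reopen an existing B-tree read-only and probe some keys.
 * Run with no argument to allow mmap, or with "nommap" to pass DB_NOMMAP. */
#include <db.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define ENV_HOME "./env"          /* hypothetical environment directory */
#define DB_FILE  "big.db"         /* hypothetical database file in ENV_HOME */
#define NKEYS    UINT64_C(1000)   /* hypothetical number of keys to probe */

int main(int argc, char **argv)
{
    u_int32_t open_flags = DB_RDONLY;
    DB_ENV *env = NULL;
    DB *db = NULL;
    uint64_t k, missing = 0;
    int ret;

    if (argc > 1 && strcmp(argv[1], "nommap") == 0)
        open_flags |= DB_NOMMAP;  /* step 4: forbid mmap for this handle */

    if ((ret = db_env_create(&env, 0)) != 0) {
        fprintf(stderr, "db_env_create: %s\n", db_strerror(ret));
        return 1;
    }
    env->set_errfile(env, stderr);  /* show library messages such as BDB3008 */

    /* Step 2: 5GB cache (one region) and a 16GB mmap limit. */
    if ((ret = env->set_cachesize(env, 5, 0, 1)) != 0 ||
        (ret = env->set_mp_mmapsize(env, (size_t)16 << 30)) != 0 ||
        (ret = env->open(env, ENV_HOME, DB_CREATE | DB_INIT_MPOOL, 0)) != 0)
        goto done;

    if ((ret = db_create(&db, env, 0)) != 0 ||
        (ret = db->open(db, NULL, DB_FILE, NULL, DB_BTREE, open_flags, 0)) != 0)
        goto done;

    /* Step 3: look up keys; the key encoding here is an assumption. */
    for (k = 0; k < NKEYS; k++) {
        DBT key, data;
        memset(&key, 0, sizeof(key));
        memset(&data, 0, sizeof(data));
        key.data = &k;
        key.size = sizeof(k);
        ret = db->get(db, NULL, &key, &data, 0);
        if (ret == DB_NOTFOUND)
            missing++;            /* count the miss and keep going */
        else if (ret != 0)
            break;                /* any other error, e.g. a panic */
    }
    if (ret == DB_NOTFOUND)
        ret = 0;
    printf("probed up to %llu keys, %llu missing\n",
           (unsigned long long)k, (unsigned long long)missing);

done:
    if (ret != 0)
        fprintf(stderr, "error: %s\n", db_strerror(ret));
    if (db != NULL)
        db->close(db, 0);
    if (env != NULL)
        env->close(env, 0);
    return (ret == 0 && missing == 0) ? 0 : 1;
}
```

Running it once without arguments and once with "nommap", each in a fresh process, mirrors the comparison between steps 3 and 4.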

Changing the OS, compiler, or machine makes no difference, and neither does switching between 4096-byte and 8192-byte pages or making sure the page size matches the filesystem block size.
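
As a sanity check on the numbers quoted in the steps, the small helpers below compare a file size against the 4GiB (2^32 byte) boundary and count whole pages. The function names are made up for illustration, and the 4096-byte page size in the usage note is an assumption about which page size produced those two files.

```c
#include <stdint.h>

#define FOUR_GIB (UINT64_C(1) << 32)   /* 4,294,967,296 bytes */

/* Signed distance of a file size from the 4GiB boundary (positive = over). */
static int64_t bytes_over_4gib(uint64_t file_bytes)
{
    return (int64_t)file_bytes - (int64_t)FOUR_GIB;
}

/* Number of whole pages in the file for a given page size. */
static uint64_t pages_in_file(uint64_t file_bytes, uint64_t page_size)
{
    return file_bytes / page_size;
}

/* Nonzero when the file size is an exact multiple of the page size. */
static int is_whole_pages(uint64_t file_bytes, uint64_t page_size)
{
    return file_bytes % page_size == 0;
}
```

With the sizes from the post, bytes_over_4gib(4341264384) is 46,297,088 and bytes_over_4gib(4247216128) is -47,751,168, so the two files sit roughly 46MB above and 48MB below the boundary. Both sizes are exact multiples of 4096, giving 1,059,879 and 1,036,918 pages respectively.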

One more strange thing: if you generate the database, sync it, close it, and then reopen it read-only, all in the same run of a program and within the same db environment, the over-4GB database checks out fine, i.e. all the keys are found. However, if you exit the program and reopen the database read-only for checking, the same program and code path fails as described above.

Any insight, questions, or suggestions would be greatly appreciated.

Answers

  • User_DAIKJ

    Never mind the "over 4GB" part. Today I created a B-tree with the same 67,200,300 records whose file came in under 4GB, and it still shows the same "missing keys" problem when the database is used in subsequent runs of the programs that need it.

    This seems like a pretty significant problem. Since there appears to be almost no active participation for Berkeley DB here (certainly no devs or maintainers hanging out on the forum), I have lost confidence in the code base and will have to find and start using something else.