
    Data Bloat in je 4.1.17

      We upgraded to 4.1.17 from 4.0.92 and we are seeing a heavy increase in the data size on disk. DbPrintLog reports massive numbers of DBINs and DINs. A cursor walk of the database, checking the number of duplicates for each key, shows that the data does not actually contain that many duplicates.

      I know we had this conversation about duplicates in 4.x: Re: Clarification on 4.x handling of duplicates

      Is the increase because in 4.1.17 we have the duplicate tree for all keys, regardless of whether or not they actually have duplicate values?
      Any idea what's going on here? Any pointers on how to go about debugging this are greatly appreciated.

      PS: I will get the DbPrintLog output in a while.

        • 1. Re: Data Bloat in je 4.1.17
          Hi Vinoth,

          There is no difference in the way duplicates are handled between JE 4.0 and 4.1. The only real change was in JE 5.0. I suggest looking for other things that may have changed when you upgraded, as a starting point.

          • 2. Re: Data Bloat in je 4.1.17
            I kind of suspect it's creating the duplicate tree for every LN. The number of LNs, DINs, and DBINs roughly match up, which leads me to believe this.

            Full log below:

            type,total count,provisional count,total bytes,min bytes,max bytes,avg bytes,entries as % of log
            key bytes,,,4361532885,,,,0.5058740175065757
            data bytes,,,157753659939,,,,18.297117053539786

            Total bytes in portion of log read: 862,177,683,388
            Total number of entries: 232,253,571
            • 3. Re: Data Bloat in je 4.1.17
              Hi Mark,

              The changes contain just the new BDB version plus some monitoring code. So, I am sure nothing else has changed.

              • 4. Re: Data Bloat in je 4.1.17
                It looks like around 2/3 of the LNs have a duplicate set. I thought you had said that was normal.

                If you can create identical data sets with 4.0 and 4.1, and do the printlog summary for both, it would be useful to compare them. Alongside this, it would be good to know how many true duplicates there are, for both data sets.

                • 5. Re: Data Bloat in je 4.1.17
                  Hi Mark,

                  The ratio of DINs to DBINs is 32665083.0 / 37469028.0 = 87%. This is way too high: the maximum duplicate percentage we usually see is around 20%. And the disk usage keeps going up as we speak.

                  Not sure.

                  • 6. Re: Data Bloat in je 4.1.17
                    This isn't enough info for me to conclude anything. Comparing the details of your 4.0 and 4.1 runs would tell more. I'm not saying that nothing is wrong, only that this is not enough info to figure out what is happening.
                    • 7. Re: Data Bloat in je 4.1.17
                      Hi Mark,

                      For every Voldemort put, we do the following BDB operations (in a single transaction):

                      // read + delete to remove the old versions of this key
                      OperationStatus status = cursor.getSearchKey(key, data, LockMode.RMW);
                      while (status == OperationStatus.SUCCESS) {
                          cursor.delete();
                          status = cursor.getNextDup(key, data, LockMode.RMW);
                      }
                      // put the new version
                      cursor.put(key, newData);

                      I did a quick test inserting 1000 keys and then updating them right away (all serially). I printed the log to make sure the server was deleting for every key.
                      With 4.1.17, I see the following:
                      type,total count,provisional count,total bytes,min bytes,max bytes,avg bytes,entries as % of log
                      key bytes,,,384,,,,0.5873355766289385
                      data bytes,,,4184,,,,6.3995105536861425

                      Total bytes in portion of log read: 65,380
                      Total number of entries: 1,147

                      With 4.0.92, I see the following:

                      type,total count,provisional count,total bytes,min bytes,max bytes,avg bytes,entries as % of log
                      key bytes,,,384,,,,1.6784683975872017
                      data bytes,,,4184,,,,18.288311915377218

                      Total bytes in portion of log read: 22,878
                      Total number of entries: 547

                      You can see that with 4.0.92 the DBINs and DINs are not present. It could also be a problem with the way we call the BDB APIs. I will be really glad to provide any more information you need.
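
                      For reference, here is a minimal sketch of the kind of test described above (the class name, path, and setup details are my assumptions, not the exact test code):

                      import java.io.File;
                      import com.sleepycat.je.*;

                      public class DupBloatRepro {
                          public static void main(String[] args) throws Exception {
                              EnvironmentConfig envConfig = new EnvironmentConfig();
                              envConfig.setAllowCreate(true);
                              envConfig.setTransactional(true);
                              Environment env = new Environment(new File("/tmp/bloat-env"), envConfig);

                              DatabaseConfig dbConfig = new DatabaseConfig();
                              dbConfig.setAllowCreate(true);
                              dbConfig.setTransactional(true);
                              dbConfig.setSortedDuplicates(true); // duplicates on, as in our store
                              Database db = env.openDatabase(null, "testDb", dbConfig);

                              // pass 0 inserts 1000 keys, pass 1 updates them right away
                              for (int pass = 0; pass < 2; pass++) {
                                  for (int i = 0; i < 1000; i++) {
                                      DatabaseEntry key = new DatabaseEntry(("key" + i).getBytes("UTF-8"));
                                      DatabaseEntry val = new DatabaseEntry(("val" + pass).getBytes("UTF-8"));
                                      Transaction txn = env.beginTransaction(null, null);
                                      Cursor cursor = db.openCursor(txn, null);
                                      // delete any old versions of the key, then insert the new one
                                      DatabaseEntry found = new DatabaseEntry();
                                      OperationStatus status = cursor.getSearchKey(key, found, LockMode.RMW);
                                      while (status == OperationStatus.SUCCESS) {
                                          cursor.delete();
                                          status = cursor.getNextDup(key, found, LockMode.RMW);
                                      }
                                      cursor.put(key, val);
                                      cursor.close();
                                      txn.commit();
                                  }
                              }
                              db.close();
                              env.close();
                          }
                      }

                      Running DbPrintLog over the resulting environment under 4.1.17 versus 4.0.92 should show the extra DIN/DBIN entries only in the 4.1.17 log.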

                      • 8. Re: Data Bloat in je 4.1.17
                        The much more bizarre thing is that with each additional update, I see one more DBIN and DIN in the printLog output.
                        When I iterate using searchDupKey(), it just returns one value. I have a small BDB program which reproduces this.
                        I can send you the code if you want to take a look.

                        Thanks a lot, Mark.
                        • 9. Re: Data Bloat in je 4.1.17
                          More information.

                          1. I don't see the DBINs and DINs when I just do a bunch of inserts. It's only after the updates.
                          2. Also, if I turn off sortedDuplicates, I don't see this issue (obvious, but trying to narrow it down); the sketch below shows the flag being toggled.
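
                          For clarity, a minimal sketch of that configuration flag (the variable names are my assumptions):

                          DatabaseConfig dbConfig = new DatabaseConfig();
                          dbConfig.setAllowCreate(true);
                          dbConfig.setTransactional(true);
                          // true reproduces the DIN/DBIN growth after updates;
                          // false (the default) makes the issue disappear
                          dbConfig.setSortedDuplicates(true);
                          Database db = env.openDatabase(null, "testDb", dbConfig);
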
                          • 10. Re: Data Bloat in je 4.1.17

                            Your evidence is very convincing, thanks for collecting it.

                            This could be a side effect of a bug fix for slot reuse in a duplicates DB. Slot reuse occurs when you delete and then insert a key. For example, see [#17252] and [#19026] in the JE 4.1 change log. I haven't looked at these fixes yet to see if they could have such a side effect, but I will.

                            It would help if you could send me the source code for the small test you mentioned, so I can see exactly what JE methods you're using.

                            • 11. Re: Data Bloat in je 4.1.17
                              The cause is almost certainly this fix:

                              Fixed a bug that causes an EnvironmentFailureException LOG_FILE_NOT_FOUND when using a particular sequence of operations with a Database configured for duplicates (DatabaseConfig.setSortedDuplicates(true)) or a DPL secondary key with a MANY_TO_ONE or MANY_TO_MANY relationship. The sequence of operations that causes the bug is:

                              1. A single record exists (no duplicates) for key A.
                              2. New transaction T1 starts.
                              3. Key A is deleted in T1.
                              4. Two or more duplicates are inserted for key A in T1.
                              5. The T1 transaction aborts.

                              Later, the log file containing the original version of the record is cleaned and deleted, causing an exception when that record is subsequently accessed. An example stack trace is below.

                              (JE 4.0.92) var/db/bdb_userNode fetchTarget of 0x3151/0x44b2638 parent
                              IN=729921304 IN class=com.sleepycat.je.tree.DBIN
                              lastFullVersion=0x3189/0x13616b0 parent.getDirty()=false state=0
                              LOG_FILE_NOT_FOUND: Log file missing, log is likely invalid. Environment is
                              invalid and must be closed.
                              at com.sleepycat.je.tree.IN.fetchTarget(IN.java:1241)
                              at com.sleepycat.je.tree.BIN.fetchTarget(BIN.java:1300)
                              Thanks to "helg" on OTN for reporting this and working with us to diagnose the problem. [#17252] (fixed in 4.0.114)

                              The SR # is wrong in the change log, it should be [#18937]. Note that this fix is in 4.0.114, so if you were to upgrade to the latest 4.0 release you'd have the same problem.

                              Of course, the bug described above is not something you're running into right now, but the fix was necessary. The previous behavior was incorrect. For details on the internal change (if you're interested) see:
                              src/com/sleepycat/je/tree/Tree.java -- search for [#18937].
                              test/com/sleepycat/je/DupSlotReuseTest.java -- the test for this fix; it contains a detailed description.

                              The new/correct behavior, implemented with this fix, is that a txn that does a delete followed by an insert, with the same key but different data, will add a new slot rather than reusing the slot for the deleted record. This means a DIN/DBIN will be created if one does not already exist. This is only true for a duplicates DB, of course.

                              A potential workaround is to add a duplicate comparator (DatabaseConfig.setDuplicateComparator) that compares only the Voldemort version, which I assume is embedded in the data. This works if the version alone is enough to distinguish two records with the same key, and if a delete followed by an insert typically uses the same version for the old and new record; i.e., I am assuming that when only one version exists for a key, version 0 (or some other constant) is always used. This is a complex and risky workaround: it will require re-loading all data if the sort order for duplicates changes (and I assume it will), and it will take more thought and much testing.
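
                              To make that concrete, here is a minimal sketch of such a comparator, assuming (purely for illustration) that the version is an 8-byte prefix of the data; the real offset and length would depend on Voldemort's record layout:

                              import java.io.Serializable;
                              import java.util.Comparator;

                              // Hypothetical sketch: orders duplicates by an assumed 8-byte
                              // version prefix and ignores the rest of the data bytes.
                              public class VersionOnlyComparator implements Comparator<byte[]>, Serializable {
                                  private static final int VERSION_BYTES = 8; // assumed prefix length

                                  public int compare(byte[] d1, byte[] d2) {
                                      for (int i = 0; i < VERSION_BYTES; i++) {
                                          int b1 = d1[i] & 0xff; // compare as unsigned bytes
                                          int b2 = d2[i] & 0xff;
                                          if (b1 != b2) {
                                              return b1 - b2;
                                          }
                                      }
                                      return 0; // same version => same record, payload ignored
                                  }
                              }

                              It would be installed with dbConfig.setDuplicateComparator(new VersionOnlyComparator()) before opening the database; since this changes the duplicate sort order, the reload mentioned above would be required.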

                              I don't believe there are any simpler workarounds, but I'll keep thinking.

                              When we reworked the implementation of duplicates in JE 5, one of the motivations was to remove all this complexity, and we did: DINs and DBINs no longer exist. I know I sound like a broken record, but I don't believe you're going to stop having problems with duplicates until you move to JE 5.

                              • 12. Re: Data Bloat in je 4.1.17
                                On 07/03/2012 11:49 PM, Vinoth Chandar wrote:
                                > The version will never be the same for a delete followed by an insert. So, I don't think this will work.

                                I was afraid of that.

                                > I am pretty sure the problem is related to the transaction-related fix you mentioned in the JE forums. I tried placing the deletes and inserts in two separate transactions, and the bloat does not seem to happen. Is it possible to have a fix for this rolled out? If yes, would it also clean up all the extra DBINs/DINs currently in the log? Please let us know.

                                No, it is not possible to change this behavior, because it is not a bug. It is the correct behavior with duplicates in JE 4.1. If we changed this behavior, it would cause other problems, such as the problem with aborts described in the original bug fix. The incorrect behavior may have resulted in better performance, in the particular case where you are doing the delete and insert in a single txn, but it could result in complete data loss for the environment (LOG_FILE_NOT_FOUND, as in the case of the abort bug). We cannot risk data loss.

                                Can you use two separate txns for the delete and insert? That's what I was just about to suggest, since it is another workaround. The drawback, of course, is that the update won't be atomic: if you crash between the two transactions, the original data will remain deleted.
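
                                A minimal sketch of that two-transaction pattern (assumed names; error handling omitted):

                                // Not atomic: a crash between the two commits leaves the key deleted.
                                Transaction t1 = env.beginTransaction(null, null);
                                db.delete(t1, key);        // remove the old version(s)
                                t1.commit();

                                Transaction t2 = env.beginTransaction(null, null);
                                db.put(t2, key, newData);  // insert the new version
                                t2.commit();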

                                • 13. Re: Data Bloat in je 4.1.17
                                  To give you more information about this situation, I'm going to try to describe more about what is happening in the Btree when you run your test.

                                  Before the txn starts, you have a single LN in a BIN. Let's say this LN has key K1 and data D1. There is no DIN or DBIN for K1.

                                  The txn starts. You delete the LN with K1/D1. However, the slot in the BIN cannot be deleted until the txn ends (this is fundamental to JE's transaction architecture). So the BIN slot with K1 is marked deleted, but is still present.

                                  Then you insert another LN with the same key but different data, say K1/D2. While the txn is open, this second LN must coexist with the first LN, because the data is different. In a duplicates database, you cannot overwrite a record with different data, since the data is part of what uniquely identifies the record.

                                  In order for two slots with key K1 to coexist, the DIN and DBIN must be created. This is what you're seeing.

                                  (The bug that was fixed [#18937] was that the deleted K1/D1 slot was incorrectly being reused/overwritten by the insert of K1/D2.)
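
                                  In code form, the sequence described above is roughly the following (assumed names; db is the duplicates database, k1/d1/d2 are DatabaseEntry values):

                                  // Before: a single LN K1/D1 in a BIN; no DIN/DBIN for K1.
                                  Transaction txn = env.beginTransaction(null, null);
                                  db.delete(txn, k1);   // the K1/D1 slot is marked deleted, but kept until txn end
                                  db.put(txn, k1, d2);  // K1/D2 must coexist with the deleted K1/D1 slot,
                                                        // so a DIN/DBIN is created to hold both
                                  txn.commit();         // the deleted slot can now be compressed away,
                                                        // but the DIN/DBIN remains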

                                  • 14. Re: Data Bloat in je 4.1.17
                                    Hi Mark,

                                    Thanks for the explanation.

                                    > Can you use two separate txns for the delete and insert?

                                    I am afraid we cannot. Vector clock maintenance is a critical piece of Voldemort's consistency model, so this is not really an option.

                                    > No, it is not possible to change this behavior, because it is not a bug.

                                    I am sorry to disagree. Your explanation described why a DIN and DBIN must be created even if K does not have duplicates. But I still don't understand why DINs and DBINs must be continuously created for every such update and never cleaned up. In other words, N such transactions produce N DINs/DBINs. This will just fill up the disk.

                                    However, I am now working on handling duplicates completely in Voldemort code, transparent to JE (there went my July 4th).
                                    I would like to know the following:

                                    -- How can we make the DINs and DBINs go away? Would switching back to 4.0.92 automatically fix the bloat we currently have?
                                    -- Can we get a patch to make the bloat go away?
