I believe the problem is that you're trying to put all of your nodes into one single XML document. In my experience, you'll get the performance you're looking for if you break the document up into as many smaller nodes/documents as you can and do your node insertions against those. Conversely, if you need the entire document out of the database, you can use XQuery to stitch it back together. Think of it as XML database "normalization" :)
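To make the idea concrete, here is a minimal sketch of the split/stitch step using only the standard JDK XML classes, not the DB XML API. The class name `DocumentSplitter` and its methods are hypothetical, and it assumes that each child element of the root is a sensible unit to store as its own document. In a real setup each returned string would be stored as a separate document in the container, and the stitching would more likely be done with an XQuery over the container than in Java.

```java
import java.io.StringReader;
import java.io.StringWriter;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.xml.sax.InputSource;

// Hypothetical helper: illustrates "normalizing" one big XML document
// into many small ones. It does not talk to Berkeley DB XML at all.
public class DocumentSplitter {

    // Returns one small XML string per child element of the root.
    public static List<String> split(String xml) throws Exception {
        Document doc = parse(xml);
        List<String> parts = new ArrayList<>();
        Node child = doc.getDocumentElement().getFirstChild();
        for (; child != null; child = child.getNextSibling()) {
            if (child.getNodeType() == Node.ELEMENT_NODE) {
                parts.add(serialize(child));
            }
        }
        return parts;
    }

    // Rebuilds a single document by wrapping the parts in a new root element.
    public static String stitch(String rootName, List<String> parts) throws Exception {
        Document out = newBuilder().newDocument();
        Element root = out.createElement(rootName);
        out.appendChild(root);
        for (String part : parts) {
            // importNode copies the parsed element into the output document
            Node copy = out.importNode(parse(part).getDocumentElement(), true);
            root.appendChild(copy);
        }
        return serialize(out);
    }

    private static DocumentBuilder newBuilder() throws Exception {
        return DocumentBuilderFactory.newInstance().newDocumentBuilder();
    }

    private static Document parse(String xml) throws Exception {
        return newBuilder().parse(new InputSource(new StringReader(xml)));
    }

    private static String serialize(Node node) throws Exception {
        Transformer t = TransformerFactory.newInstance().newTransformer();
        // Omit the XML declaration so parts can be compared and nested easily
        t.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
        StringWriter w = new StringWriter();
        t.transform(new DOMSource(node), new StreamResult(w));
        return w.toString();
    }
}
```

With documents this small, an insert only touches the one document it belongs to, rather than a single 2.5MB tree.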
Hope that helps,
Thanks a lot for your reply! However, I have a few more questions.
Given that the XML is about 2.5MB in size, is that considered so big for Berkeley DB that it needs to be broken down into smaller documents?
Also, after executing ~4000 XQuery Update insert operations, which doubled the XML’s size, performance gets much worse: a single insertion can take up to 1.5 min, whereas each of the first 3000 insertions needed less than 0.5 sec. Is there something I am missing in the configuration of Berkeley DB? If I turn auto-indexing off and maintain fewer indexes, could I see significantly better performance? So far, I just get the following error when I set my indexes and try to execute consecutive insertions on the same node:
Exception in thread "main" com.sleepycat.dbxml.XmlException: Error: Sequence does not match type item() - the sequence does not contain items [err:XUDY0027], <query>:1:1, errcode = QUERY_EVALUATION_ERROR
at com.sleepycat.dbxml.dbxml_javaJNI.XmlQueryExpression_execute__SWIG_1(Native Method)
It's not that the file size is too big; I think it has more to do with what you're trying to do with the document. If you need to perform thousands of inserts/updates against a database that contains only a single document, you are not leveraging the benefits of using the database to begin with. Your mileage may vary, and others may disagree.
That said, a couple of other things you might want to look at given your existing architecture:
- make sure all statistics are turned off
- use the WholeDocContainer type instead of the node type
- turn off auto indexing (which you've already done) and make sure you're only indexing the nodes/attributes/metadata you need to be indexing