0 Replies Latest reply: Feb 15, 2012 1:44 PM by 917772

    Berkeley XML DB Performance

    917772
      We need help with maximizing performance of our use of Berkeley DB XML.

      *1. Describe the Performance area that you are measuring? What is the current performance? What are your performance goals you hope to achieve?*

      I'm using Berkeley DB XML to insert and query XML in a stream processing system that I'm developing. Multiple processes access the same container concurrently.
      However, inserting the XML documents into the database takes too long. I wrote a test program that sums the time spent in calls to the "putDocument" method while inserting 20,000 XML documents (320,000 bytes); the total was 81 s. Saving them to an XML file on disk took just 0.035 s. How can I reduce the insertion time?
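For reference, a file-write baseline like the 0.035 s figure above could be measured roughly as follows. This is only a minimal sketch, not the original test program: the function name `timeFileBaseline`, the output path, and the document string are hypothetical placeholders. It also only flushes and closes the stream; it does not force the data to disk, so the number reflects buffered writes.

```cpp
#include <chrono>
#include <cstddef>
#include <fstream>
#include <string>

// Hypothetical helper: writes `count` copies of `doc` (one per line) to `path`
// and returns the elapsed wall-clock time in seconds.
// The stream is flushed and closed before the clock stops, but no fsync is
// issued, so the data may still be in the OS page cache when this returns.
double timeFileBaseline(const std::string& path,
                        const std::string& doc,
                        std::size_t count) {
    const auto start = std::chrono::steady_clock::now();
    {
        std::ofstream out(path, std::ios::binary | std::ios::trunc);
        for (std::size_t i = 0; i < count; ++i) {
            out << doc << '\n';
        }
    }  // the ofstream destructor flushes and closes the file here
    const auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}
```

A call such as `timeFileBaseline("out.xml", document1, 20000)` would give a figure comparable in spirit to the one quoted above, assuming `document1` holds the XML text as a `std::string`.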

      *2. What Berkeley DB XML Version? Any optional configuration flags specified? Are you running with any special patches? Please specify?*

      Version dbxml-2.5.16
      No special patches.

      *3. What Berkeley DB Version? Any optional configuration flags specified? Are you running with any special patches? Please Specify.*

      Version db-4.8.26
      No special patches.

      *4. Processor name, speed and chipset?*

      Intel(R) Core(TM)2 Duo CPU E7500 @ 2.93GHz

      *5. Operating System and Version?*

      Ubuntu 10.04 LTS (Lucid Lynx).
           
      *6. Disk Drive Type and speed?*

      Don't have that information

      *7. File System Type? (such as EXT2, NTFS, Reiser)*

      EXT3

      *8. Physical Memory Available?*

      3 GB

      *9. Are you using Replication (HA) with Berkeley DB XML? If so, please describe the network you are using, and the number of replicas.*

      No.

      *10. Are you using a Remote Filesystem (NFS) ? If so, for which Berkeley DB XML/DB files?*

      No.

      *11. What type of mutexes do you have configured? Did you specify --with-mutex=? Specify what you find in your config.log, search for db_cv_mutex?*

      None.

      *12. Which API are you using (C++, Java, Perl, PHP, Python, other) ? Which compiler and version?*

      C++
      Compiler: g++ Version: 4:4.4.3-1ubunt


      *13. If you are using an Application Server or Web Server, please provide the name and version?*

      No

      *14. Please provide your exact Environment Configuration Flags (include anything specified in your DB_CONFIG file)*

      u_int32_t envFlags = DB_CREATE|DB_INIT_MPOOL;
      u_int32_t envCacheSize = 64*1024*1024;
      int dberr;
      DB_ENV *dbEnv = 0;
      dberr = db_env_create(&dbEnv, 0);
      if (dberr == 0) {
           dbEnv->set_cachesize(dbEnv, 0, envCacheSize, 1);
           dberr = dbEnv->open(dbEnv, path2DbEnv.c_str(), envFlags, 0);
      }

      *15. Please provide your Container Configuration Flags?*

      XmlManager db(dbEnv, DBXML_ADOPT_DBENV | DBXML_ALLOW_EXTERNAL_ACCESS);
      XmlContainerConfig config;
      config.setAllowValidation(true);
      config.setContainerType(XmlContainer::WholedocContainer);
      XmlContainer container = db.createContainer(theContainer, config);
      XmlUpdateContext updateContext = db.createUpdateContext();
      XmlIndexSpecification idxSpec = container.getIndexSpecification();
      idxSpec.setAutoIndexing(false);
      container.setIndexSpecification(idxSpec, updateContext);

      *16. How many XML Containers do you have? For each one please specify:*
      One.

           *1. The Container Configuration Flags*

           XmlManager db(dbEnv, DBXML_ADOPT_DBENV | DBXML_ALLOW_EXTERNAL_ACCESS);
           XmlContainerConfig config;
           config.setAllowValidation(true);
           config.setContainerType(XmlContainer::WholedocContainer);
           XmlContainer container = db.createContainer(theContainer, config);
           XmlUpdateContext updateContext = db.createUpdateContext();
           XmlIndexSpecification idxSpec = container.getIndexSpecification();
           idxSpec.setAutoIndexing(false);
           container.setIndexSpecification(idxSpec, updateContext);

           *2. How many documents?*
           Many documents.

           *3. What type (node or wholedoc)?*
           Wholedoc.

      *18. What is the rate of document insertion/update required or expected? Are you doing partial node updates (via XmlModify) or replacing the document?*

      Since we are using the database to persist a stream, the access pattern should be bursty, alternating between periods of high and low demand. No.


      *21. Are you running with Transactions? If so please provide any transactions flags you specify with any API calls.*
      No


      *23. Do you use AUTO_COMMIT?*
      No.


      *26. Please include a paragraph describing the performance measurements you have made. Please specifically list any Berkeley DB operations where the performance is currently insufficient.*

      I wrote a test program that sums the time spent in calls to the "putDocument" method while inserting 20,000 XML documents. Code below:
      ...
      double temp = 0;
      for(int j = 0; j < 20000; j++){
           myXMLDoc.setContent( document1);
           struct timeval t1;
           struct timeval t2;
           gettimeofday(&t1, NULL);
           container.putDocument(myXMLDoc, updateContext, DBXML_GEN_NAME);
           gettimeofday(&t2, NULL);
           temp += (t2.tv_sec - t1.tv_sec) + (t2.tv_usec - t1.tv_usec) * 0.000001;
      }
      ...
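The same accumulate-per-call pattern can be written as a small self-contained helper. This is a sketch under assumptions rather than the actual test program: `sumCallTime` is a hypothetical name, and it uses `std::chrono::steady_clock` in place of `gettimeofday` so that wall-clock adjustments do not affect the result.

```cpp
#include <chrono>
#include <cstddef>

// Hypothetical helper: calls `op(i)` for i in [0, iterations) and returns the
// summed wall-clock seconds spent inside `op` only. Work done between calls
// (loop overhead, preparing the next document) is not counted.
template <typename Op>
double sumCallTime(std::size_t iterations, Op op) {
    using Clock = std::chrono::steady_clock;
    std::chrono::duration<double> total(0.0);
    for (std::size_t i = 0; i < iterations; ++i) {
        const Clock::time_point t1 = Clock::now();
        op(i);
        total += Clock::now() - t1;
    }
    return total.count();
}
```

With the objects from the snippet above, the measured call would be passed as a lambda, for example `sumCallTime(20000, [&](std::size_t) { container.putDocument(myXMLDoc, updateContext, DBXML_GEN_NAME); })`. Note that the original loop also calls `setContent` outside the timed region on every iteration, which this one-line usage does not reproduce.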

      *29. Are there any other significant applications running on this system? Are you using Berkeley DB outside of Berkeley DB XML? Please describe the application?*
      No. No.


      Thanks,
      Ana Paula