12 Replies Latest reply on Mar 20, 2019 4:22 PM by Greybird-Oracle

    Share cleaner threads for multiple environments

    Nitin.Patel-Oracle

      Hi,

       

      I believe that each open environment will have its own set of cleaner threads which perform compaction. Our application (Oracle Unified Directory) opens a large number of environments (for isolation etc), resulting in too many cleaner threads, and we wanted to see if this can be avoided. Hence the question - Is there a way to share these cleaner threads across all the environments (similar to shared cache)?

       

      Any help is highly appreciated.

       

      Thanks

      Nitin

        • 1. Re: Share cleaner threads for multiple environments
          Greybird-Oracle

          Hi Nitin,

           

          There is no built-in way to share cleaner, checkpointer, evictor and IN compressor threads among JE Environments in a single JVM process and I don't recommend large numbers of JE Environments per JVM process. The only way to share threads is to disable the JE built-in threads (by configuring EnvironmentConfig.ENV_RUN_CLEANER, etc, to false) and use a thread in your application to periodically call Environment.cleanLog(), checkpoint(), evict() and compress().

           

          --mark
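As a rough sketch of the application-level approach Mark describes, the class below runs maintenance for many environments on one shared thread. `MaintainableEnv` and `SharedMaintenance` are hypothetical names, not JE types: the interface stands in for the real `Environment` calls so the example compiles on its own. In a real application each method would delegate to the corresponding `Environment` method (the JE javadoc names the eviction call `evictMemory()`, so check names against your JE version), and the built-in daemon threads would first be disabled through `EnvironmentConfig.ENV_RUN_CLEANER` and its sibling settings.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for the maintenance calls on one JE Environment. */
interface MaintainableEnv {
    int cleanLog();      // number of log files cleaned in this call
    void checkpoint();
    void evict();
    void compress();
}

/** Runs maintenance for all registered environments on a single shared thread. */
final class SharedMaintenance {
    private final List<MaintainableEnv> envs = new CopyOnWriteArrayList<>();
    private ScheduledExecutorService worker;

    public void register(MaintainableEnv env) {
        envs.add(env);
    }

    /** One maintenance pass over every environment; returns total files cleaned. */
    public int runOnce() {
        int totalCleaned = 0;
        for (MaintainableEnv env : envs) {
            try {
                totalCleaned += env.cleanLog();
                // Checkpoint on every pass, since the built-in checkpointer is
                // assumed to be disabled. A checkpoint is also what allows
                // cleaned files to be deleted.
                env.checkpoint();
                env.evict();
                env.compress();
            } catch (RuntimeException e) {
                // One failing environment must not stop maintenance of the rest.
                System.err.println("maintenance pass failed: " + e);
            }
        }
        return totalCleaned;
    }

    /** Starts the shared thread; the period is an application choice. */
    public synchronized void start(long periodSeconds) {
        if (worker == null) {
            worker = Executors.newSingleThreadScheduledExecutor();
            worker.scheduleWithFixedDelay(
                    () -> runOnce(), periodSeconds, periodSeconds, TimeUnit.SECONDS);
        }
    }

    public synchronized void stop() {
        if (worker != null) {
            worker.shutdownNow();
            worker = null;
        }
    }
}
```

A single thread serializes maintenance across all environments, so one slow cleaning pass delays the others. Whether that trade-off is acceptable depends on the write rate per environment.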

          • 2. Re: Share cleaner threads for multiple environments
            Nitin.Patel-Oracle

            Hi Mark,

             

            Thanks for the response. A few follow-up questions:

            1. What is the maximum number of JE environments per JVM that you would recommend? I know this might be subjective, but if you have any number, that would really help.

            2. We can explore the approach that you are suggesting, i.e., have common cleaner/compressor threads at application level.

            3. Currently, to be able to scale to say 100 tenants, we were thinking of isolating them using different JE environments, each having the same tables/databases. Do you have any better approach in mind which we should follow instead?

             

            Thanks

            Nitin

            • 3. Re: Share cleaner threads for multiple environments
              Greybird-Oracle

              I don't have any numbers on the maximum number of JE envs per JVM, but I can tell you that in NoSQL DB, which is a sharded, multi-tenant server product using JE, we use one env per JVM process. In addition to the issue with threads, multiple envs don't perform as well when sharing a single disk, at least a spinning disk, since the head will be moving during writes from multiple envs. There is also the issue of replication. I know you're not using JE HA, but you may want to consider resource usage for LDAP replication with various approaches.

               

              There are many ways to separate tenant data. NoSQL DB uses logical tables for each tenant, using a key prefix for the table ID. Another approach is to use separate JE Databases for each tenant; this is probably the simplest approach.

               

              --mark
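To illustrate the key-prefix idea, here is a minimal sketch that prepends a fixed-width tenant ID to each record key, so that all tenants can share one JE Database. The class name `TenantKeys` and the 4-byte big-endian layout are assumptions made for illustration; this is not NoSQL DB's actual key encoding.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical key layout: [4-byte big-endian tenant ID][UTF-8 record key]. */
final class TenantKeys {
    private static final int PREFIX_LEN = 4;

    /** Builds the stored key for a tenant's record. */
    static byte[] encode(int tenantId, String recordKey) {
        byte[] suffix = recordKey.getBytes(StandardCharsets.UTF_8);
        return ByteBuffer.allocate(PREFIX_LEN + suffix.length)
                .putInt(tenantId)   // ByteBuffer writes big-endian by default
                .put(suffix)
                .array();
    }

    /** Reads the tenant ID back from a stored key. */
    static int tenantOf(byte[] key) {
        return ByteBuffer.wrap(key).getInt();
    }

    /** Reads the tenant-local record key back from a stored key. */
    static String recordKeyOf(byte[] key) {
        return new String(key, PREFIX_LEN, key.length - PREFIX_LEN, StandardCharsets.UTF_8);
    }

    /** True if the stored key belongs to the given tenant. */
    static boolean belongsTo(int tenantId, byte[] key) {
        return key.length >= PREFIX_LEN && tenantOf(key) == tenantId;
    }
}
```

With non-negative tenant IDs and a byte-by-byte unsigned key comparison, keys with the same prefix sort next to each other, so one tenant's records can be read with a range scan that starts at the prefix and stops at the first key for which `belongsTo` is false.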

              • 4. Re: Share cleaner threads for multiple environments
                Nitin.Patel-Oracle

                Hi Mark,

                 

                Thanks for the insights. Based on them, I think we will park the idea of using one database env per tenant.

                 

                Today, OUD defines a set of JE Databases, and we were planning to use the same set of JE Databases for each tenant in the MT setup, in different JE Environments. That way, we could easily move tenants from one instance to another just by moving db files. Can we easily move tenants around with a shared JE Environment, given that the data for all tenants may end up in a single jdb file?

                 

                Regarding the approaches, just to make sure I have got it right (due to terminology):

                "NoSQL DB uses logical tables for each tenant, using a key prefix for the table ID" - If there is a NoSQL "table" called EMPLOYEES, do you mean that NoSQL DB internally creates/opens different JE Databases such as TENANT1_EMPLOYEES, TENANT2_EMPLOYEES, etc.?

                 

                "separate JE Databases for each tenant" - By separate databases, do you mean separate JVMs by any chance?

                 

                Thanks

                Nitin

                • 5. Re: Share cleaner threads for multiple environments
                  Greybird-Oracle

                  Yes, with a single env per JVM (which I recommend), moving tenant data to different machines will require copying the individual records rather than just copying jdb files.

                   

                  No, NoSQL DB stores the data for multiple tables in a single JE Database, not a JE Database per table. The key prefix of the record identifies the table.

                   

                  By separate JE Databases, I do not mean JVMs or envs, I mean separate Databases in a single env.  However, if you have too many Databases per env with lots of small tenants, you may run into performance problems due to the per-Database overhead in JE. You will need to consider the maximum number of tenants per JVM and do tests to measure the overhead.

                   

                  I'm on vacation right now, so I may not reply back quickly.

                   

                  --mark

                  • 6. Re: Share cleaner threads for multiple environments
                    Nitin.Patel-Oracle

                    Hi Mark,

                     

                    We have tried the approach where we have a single JE env to store the data for all the tenants, and would like to know your thoughts on the same.

                     

                    In this approach, we use a separate set of tables for each tenant, so we have around 3500 JE Databases overall, and the total (file-system) size of the data is around 10GB. We compared the performance of this multi-tenant system against a vanilla OUD instance which has only 35 JE Databases and a similar data size of around 10GB. We found that the performance of the two systems is comparable, i.e., the application throughputs are almost the same. So it seems that a larger number of JE Databases does not add much overhead.

                     

                    Also note that this multi-tenant design is primarily meant for free/trial customers, so we will have control over the number of tenants that can be supported on a JVM and the data each tenant can load and query.

                     

                    So, do you think this approach can work for us?

                     

                    PS: Based on the architecture whitepaper below, the overhead of an increasing number of JE Databases seems to be in looking up the mapping tree to identify the B+Tree for a given Database name. Please clarify or correct this if there could be any other overhead with more databases:

                    https://www.oracle.com/technetwork/database/database-technologies/berkeleydb/learnmore/bdb-je-architecture-whitepaper-36…

                     

                    Thanks

                    Nitin

                    • 7. Re: Share cleaner threads for multiple environments
                      Greybird-Oracle

                      Nitin,

                       

                      This approach sounds good to me. But when comparing the single and multi-tenant runs, in addition to disk space and throughput, also please look at the cache size: the CacheTotalBytes stat. The per-database overhead not mentioned in the whitepaper is that disk utilization information is stored in memory. There is one map entry per-database and per-file in which the database's records appear. The map entry contains a fairly small object, but if the databases are spread across all files and there are a large number of files, this memory overhead is something to account for. Your test should show this difference.

                       

                      --mark

                      • 8. Re: Share cleaner threads for multiple environments
                        Greybird-Oracle

                        Nitin,

                         

                        To ensure that you are testing the worst case scenario, alternate writes among the entire set of Databases/tenants. This way the records for all Databases are spread across all files.

                         

                        --mark
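The worst case Mark describes can be estimated with simple arithmetic: one map entry per database per log file in which that database's records appear. The sketch below is only a back-of-the-envelope calculation. The class name is hypothetical, the 10,000,000-byte log file size is an assumption (check the `je.log.fileMax` setting in use), and the size of one entry is left as a parameter because the thread only calls it "fairly small".

```java
/** Back-of-the-envelope estimate of per-database utilization map entries. */
final class UtilizationOverhead {
    /** Number of log files needed for the given data size (rounded up). */
    static long logFiles(long dataBytes, long bytesPerFile) {
        return (dataBytes + bytesPerFile - 1) / bytesPerFile;
    }

    /** Worst case: every database has records in every log file. */
    static long worstCaseEntries(long databases, long logFiles) {
        return databases * logFiles;
    }

    /** Memory estimate for a measured or assumed size of one map entry. */
    static long estimateBytes(long entries, long bytesPerEntry) {
        return entries * bytesPerEntry;
    }

    public static void main(String[] args) {
        // About 10 GB on disk, split into files of an assumed 10,000,000 bytes.
        long files = logFiles(10_000_000_000L, 10_000_000L);
        System.out.println("log files:              " + files);
        System.out.println("entries, 3500 databases: " + worstCaseEntries(3500, files));
        System.out.println("entries, 35 databases:   " + worstCaseEntries(35, files));
    }
}
```

Under these assumptions, about 1,000 log files give up to 3,500,000 entries for 3500 databases, against up to 35,000 for 35 databases. This factor of 100 is what the alternating-write test is meant to expose.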

                        • 9. Re: Share cleaner threads for multiple environments
                          pranjal_ranjan-Oracle

                          Mark,

                           

                          I am collaborating on the same work. We alternated writes among the entire set of tenants, so that each tenant's records are spread across the whole set of files. For this test, we set the db-local-backend-workflow-element cache size to only 1% and the JVM heap size to 2 GB, to check the performance when relying on the filesystem. The CacheTotalBytes stat was around 20 MB. The performance numbers in this test look stressed: the etime for a search returning a single entry was around 200-300, which is almost 100 times the etime in a different OUD setup with significantly fewer tables. Is this sort of result expected given the tuning? Also, does the cache (which is set to 1%) contain the disk utilization map you mentioned earlier?

                           

                          -Pranjal

                          • 10. Re: Share cleaner threads for multiple environments
                            Greybird-Oracle

                            I don't understand the purpose of this test. I thought you were trying to compare the use of many databases (multi-tenant) to your normal product.

                             

                            Configuring a tiny JE cache and looking at latency doesn't seem meaningful to me. Setting the JE cache size to 20 MB in a 2 GB heap is not making use of the memory in the heap and 20 MB is ridiculously low.

                             

                            I suggest comparing realistic configurations with the normal and multi-tenant approaches, and looking at memory and disk usage to see the overhead added by having many JE databases. I thought that's what Nitin was doing. It seems to me that you are not in sync with Nitin. This makes it extremely difficult for me to support you.

                             

                            Also, to measure memory usage you will need to look at the eviction stats as well when comparing runs. If the cache is too small to hold the data set, eviction may occur. To compare memory usage in the two approaches by looking at CacheTotalBytes, there must be no eviction. You need to set the JE cache to a value as large as possible, large enough so that no eviction occurs.

                             

                            --mark

                            • 11. Re: Share cleaner threads for multiple environments
                              Nitin.Patel-Oracle

                              Hi Mark,

                               

                              Appreciate your response.

                               

                              I think we are on the same page here: we were trying to analyze the memory and disk overheads introduced by a large number of tables. The intention of testing with a low BDB cache was to test the disk overhead, since we might not be able to analyze it if entries are always returned from the BDB cache. Please correct me if my understanding is wrong, and please suggest the ideal way to test/analyze disk overhead. I agree that 1% is too low, though.

                               

                              To compare the normal (35 tables) and multi-tenant (3500 tables) OUD configurations, we loaded both setups with similar data and allocated a reasonable amount of JE cache (cacheTotalBytes: ~750MB, which is 40% of the heap). With similar queries performed on both, the throughputs (number of entries returned) are almost the same in both cases, so we seem to be good here.

                              Since the actual data size on disk is much larger (>5GB) than the cache size, we are seeing cache evictions in both setups. However, the multi-tenant setup has more cache being evicted (nBytesEvictedCRITICAL: ~300MB) than the other setup (nBytesEvictedCRITICAL: ~45MB). We are not sure why this is the case, or whether it should be a concern for us.

                               

                              Thanks

                              Nitin

                              • 12. Re: Share cleaner threads for multiple environments
                                Greybird-Oracle

                                On critical eviction:

                                 

                                https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentStats.html#cacheCriticalEviction

                                 

                                The key sentence is "Critical eviction is performed in-line in the thread performing the CRUD operation, which is very undesirable since it increases operation latency." I'm surprised that your average latency (response time) did not increase with critical eviction. I strongly suspect there are much higher latency spikes, e.g., in the 95% or 99% latencies. Do you really only care about the average?
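To see why an average can hide the spikes Mark suspects, here is a small sketch that computes the mean next to nearest-rank percentiles. The class name and the sample values are made up for illustration and are not measurements from this thread; the values are in whatever unit the application reports.

```java
import java.util.Arrays;

/** Mean and nearest-rank percentiles over a set of latency samples. */
final class LatencySummary {
    static double mean(long[] samples) {
        long sum = 0;
        for (long s : samples) {
            sum += s;
        }
        return (double) sum / samples.length;
    }

    /** Nearest-rank percentile for p in (0, 100]; samples must be non-empty. */
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Illustrative data: 98 fast operations and 2 slow ones.
        long[] samples = new long[100];
        Arrays.fill(samples, 0, 98, 2L);
        Arrays.fill(samples, 98, 100, 300L);
        System.out.println("mean = " + mean(samples));
        System.out.println("p50  = " + percentile(samples, 50));
        System.out.println("p99  = " + percentile(samples, 99));
    }
}
```

In this made-up data set the mean stays below 8 while the 99th percentile is 300, so two runs with similar averages can still differ widely in tail latency.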

                                 

                                The other thing is that more cache _is_ being used with more databases, due to the per DB overhead and indicated by the critical eviction. How much more? One way to find out is to make the cache large enough so that no eviction occurs, in both runs, and then look at cacheTotalBytes. Another (simpler) way is to compare the value of dataAdminBytes, since this is specifically the per-database overhead:

                                 

                                https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentStats.html#getDataAdminBytes--

                                 

                                On disk usage:

                                 

                                Now that I know that eviction is occurring, I can say there may be two factors contributing to a difference in disk usage between the two runs:

                                 

                                1. Eviction of dirty nodes will write to the log, and disk usage will be higher. As I said, sizing the cache large enough to avoid this is the recommended course of action. The nDirtyNodesEvicted stat will indicate whether this is happening. See:

                                 

                                https://docs.oracle.com/cd/E17277_02/html/java/com/sleepycat/je/EnvironmentStats.html#cacheEviction

                                 

                                """

                                Large values for NDirtyNodesEvicted or OffHeapDirtyNodesEvicted indicate that the cache is severely undersized and there is a risk of using all available disk space and severe performance problems. Dirty nodes are evicted last (after evicting all non-dirty nodes) because they must be written to disk. This causes excessive writing and JE log cleaning may be unproductive.

                                """

                                 

                                2. With no eviction there will still be additional disk usage due to the per-Database overhead. This is actually what I thought you were measuring, because I didn't know about eviction. To measure this, ensure that no dirty nodes are being evicted and compare the runs.

                                 

                                You should not set the cache size artificially low to test overheads. Set all parameters to be as close as possible to what you would expect in a deployment.
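The comparison procedure above can be sketched as a small check over two stat snapshots. `RunStats` and `RunComparison` are hypothetical holders, not JE classes: the fields are assumed to be copied from the matching `EnvironmentStats` values at the end of each run, with `bytesEvicted` taken as the sum of the bytes-evicted counters.

```java
/** Hypothetical snapshot of the stats discussed above, taken at the end of a run. */
final class RunStats {
    final long cacheTotalBytes;
    final long dataAdminBytes;
    final long bytesEvicted;       // assumed: sum of all bytes-evicted counters
    final long dirtyNodesEvicted;

    RunStats(long cacheTotalBytes, long dataAdminBytes,
             long bytesEvicted, long dirtyNodesEvicted) {
        this.cacheTotalBytes = cacheTotalBytes;
        this.dataAdminBytes = dataAdminBytes;
        this.bytesEvicted = bytesEvicted;
        this.dirtyNodesEvicted = dirtyNodesEvicted;
    }
}

final class RunComparison {
    /** cacheTotalBytes is only comparable when neither run evicted anything. */
    static boolean cacheTotalsComparable(RunStats baseline, RunStats multiTenant) {
        return baseline.bytesEvicted == 0 && multiTenant.bytesEvicted == 0;
    }

    /** Disk usage is only comparable when no dirty nodes were evicted. */
    static boolean diskUsageComparable(RunStats baseline, RunStats multiTenant) {
        return baseline.dirtyNodesEvicted == 0 && multiTenant.dirtyNodesEvicted == 0;
    }

    /** Extra per-database memory overhead in the multi-tenant run. */
    static long adminOverheadDelta(RunStats baseline, RunStats multiTenant) {
        return multiTenant.dataAdminBytes - baseline.dataAdminBytes;
    }
}
```

If either comparability check fails, the suggestion in the thread is to raise the cache size and rerun the test, not to interpret the numbers as they are.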