6 Replies · Latest reply: Mar 2, 2012 6:40 PM by 802907

    cachesize or cachememsize

    869208
      I tested DSEE 7 with 2 million objects, with dbcachesize set to 5G and the entry cache to 500MB. I get a cache hit ratio of 97%, which puzzles me. My understanding is that requested data is read from the entry cache rather than the db cache, so the entry cache should be larger and the dbcachesize smaller to get better performance and a better hit ratio. But I see the reverse here, and even the etime response is excellent.
        • 1. Re: cachesize or cachememsize
          841083
          There is a bit more to tuning the caches than just throwing memory at them.

          An entry in the entry cache will be returned only if it exactly matches what was requested and no changes have happened to the entry.
          Also keep in mind that entries in the entry cache are significantly bigger than the same entries in the database cache, so the same amount of memory in the entry cache will hold fewer entries.

          Next, add in the fact that the memory you specify includes cache overhead. The DB cache is just a static chunk of memory, but the entry cache is a data structure on the heap, so it includes the data structure overhead and heap fragmentation. All this boils down to the entry cache probably holding fewer entries than you think it will.
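
          To make that concrete, here is a back-of-the-envelope sketch in Python. Every per-entry number in it is an assumption (sizes vary enormously by schema), but it shows how lopsided the two caches from the original post are likely to be:

          ```python
          # Rough sizing sketch; all per-entry figures are invented for illustration.
          DB_CACHE_BYTES    = 5 * 2**30    # 5 GB db cache, as in the original post
          ENTRY_CACHE_BYTES = 500 * 2**20  # 500 MB entry cache

          AVG_DB_ENTRY  = 2 * 2**10        # assume ~2 KB per entry in db pages
          INFLATION     = 4                # assume the in-memory entry is ~4x larger
          HEAP_OVERHEAD = 0.20             # assume ~20% lost to structures/fragmentation

          db_entries    = DB_CACHE_BYTES // AVG_DB_ENTRY
          entry_entries = int(ENTRY_CACHE_BYTES * (1 - HEAP_OVERHEAD)) // (AVG_DB_ENTRY * INFLATION)

          print(f"db cache holds    ~{db_entries:,} entries")     # ~2.6M: the whole 2M db fits
          print(f"entry cache holds ~{entry_entries:,} entries")  # ~51K: a small fraction
          ```

          If the real numbers look anything like these, it's no surprise that a 5G db cache alone gets you a high hit ratio on a 2M-entry database.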

          Adding more memory can be counter-productive. The overhead in processing the data structure can get to the point where it outweighs any advantage that the cache might give.

          How useful the entry cache is depends heavily on your traffic patterns. Given that for anything other than a trivially small database you will not hold all of the entries in the entry cache, if your usage pattern is such that an entry will not be re-accessed for a significant period of time, and the access pattern is random over the database (a typical authentication pattern), the entry cache is likely to be of little (or no) value.

          However, if an entry is accessed multiple times in fairly rapid succession, and if you have a typically fairly small working set of entries, and can size the entry cache to hold the working set, it can make a noticeable difference.

          In reality you will almost never see much better than a 97% hit rate. As long as it is not around the 60% range or lower, it's not (in general) worth worrying about.
          • 2. Re: cachesize or cachememsize
            802907
            OP, if you want feedback about your specific test, it would be useful if you could share more specifics about your systems (hardware, RAM, disk configuration) and your workload. You should also be instrumenting your test systems to identify any bottlenecks in physical resources. If you can paste a bit more detail into the thread, we might be able to give you a better explanation of what you are seeing.

            I can add a few comments in addition to Phillip's.

            Benchmarks that measure the performance of a database fully cached in the entry cache have consistently provided the fastest search response times under load (AFAIK). The size of an entry in the entry cache includes some runtime data structures the server needs in order to handle the entry. When the entry is retrieved from the database instead, the server has to create this data structure from scratch, which adds processing time to the response. That puts any configuration other than fully cached at a distinct disadvantage if all you are going for is the fastest search response time possible.

            The cost of running fully cached in the entry cache would be significant with a large data set, as Phillip pointed out, though I don't think I'd go so far as to call 5 - 10 million entries (which can certainly be cached on a suitable system) "trivial". Earlier versions of DSEE (pre-7) were also significantly less efficient in their management of replication metadata. This had the unfortunate effect of increasing the size of entries (including the footprint of the entries in the entry cache), sometimes by a factor of two or more. What's worse, none of the cache sizing parameters took this extra size into account, so it was quite common for a replica to run the host system out of memory, despite the fact that the configuration of the entry cache should not have been able to cause that much memory to be allocated. It took quite a while for us to figure out that what we had was not a memory leak but rather an entry cache that was consuming far more memory than it was supposed to.

            But as long as you are using 7.0 or later, these sorts of problems shouldn't trouble you much. As for the database cache, a large database cache can cause some problems as well. For one thing, a database checkpoint locks pages in the database cache, which can cause a spike in response time - including search response time. With a large database cache this spike can be quite noticeable, but it's pretty easy to spot because it recurs on the db checkpoint interval.
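
            If you want to check whether checkpoints are what's biting you, a rough log scan like the one below can help. It's only a sketch: it assumes a DSEE-style access log whose RESULT lines carry an etime= field, and the file name and spike threshold are placeholders:

            ```python
            # Sketch: bucket the worst etime per second from an access log, then
            # print the spikes so you can eyeball whether they recur on the db
            # checkpoint interval. Lexical sort of timestamps is crude but fine
            # within a single day's log.
            import re
            from collections import defaultdict

            LINE = re.compile(r'\[(?P<ts>[^\]]+)\].*RESULT.*etime=(?P<etime>[\d.]+)')

            worst = defaultdict(float)
            with open("access") as log:                  # placeholder log location
                for line in log:
                    m = LINE.search(line)
                    if m:
                        ts = m.group("ts").split(".")[0]  # one-second buckets
                        worst[ts] = max(worst[ts], float(m.group("etime")))

            for ts, etime in sorted(worst.items()):
                if etime > 0.1:                           # arbitrary spike threshold
                    print(ts, etime)
            ```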

            Another problematic effect of utilizing the database cache heavily is a sort of cascade effect that access patterns can have. Since pages of id2entry and the index databases are all given the same priority in the database cache, it's possible for a burst of entry reads to flood the database cache and push out pages from the index database files. If the access pattern on the indexes is heavy (it often is, with one or two attributes appearing in the majority of search filters), the effect of these index pages getting pushed out of the database cache can be - again - a noticeable spike in response times as the index pages are physically read from disk.

            The other caching option to consider is not to rely on the Directory Server caches at all, but to use whatever cache capabilities the underlying file system gives you. This poses some risk that a rogue command executing on the host may push your data out of the cache. On the other hand, this option often gives the best all-around performance when your data is much larger than your available physical memory. It also allows you to keep the slapd process much smaller than the other options do. A small slapd process is much easier to manage, particularly if you need to use proc tools, debuggers, or a core generation tool like gcore. Under normal conditions this isn't a very big deal, but in a troubleshooting situation it can make a huge difference.

            Regarding the cache hit ratio, I have to say it's a fairly useless statistic as-is, since it's the aggregate of cachehits/cachetries over the entire uptime of the Directory Server. What is useful is to see, at some regular interval (such as 5-10 seconds, or perhaps a minute), the proportion of cachehits to cachetries over that specific interval. With that statistic you can use variations in the running cache hit ratio to interpolate into the logs and identify the types of queries that affect cache performance.
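
            Something along these lines is all it takes. This is only a sketch: the read_counters() stub is hypothetical, and the exact monitor entry and attribute names you would poll vary by version and backend:

            ```python
            # Sketch: turn cumulative cachehits/cachetries counters into a
            # per-interval hit ratio by differencing successive snapshots.
            import time

            def read_counters():
                """Return (cumulative_hits, cumulative_tries) from the server's
                monitor entry; filling this in is left to the reader."""
                raise NotImplementedError("poll your cn=monitor entry here")

            INTERVAL = 10  # seconds

            prev_hits, prev_tries = read_counters()
            while True:
                time.sleep(INTERVAL)
                hits, tries = read_counters()
                dh, dt = hits - prev_hits, tries - prev_tries
                prev_hits, prev_tries = hits, tries
                ratio = (100.0 * dh / dt) if dt else float("nan")
                print(f"last {INTERVAL}s: {dh}/{dt} = {ratio:.1f}% hit ratio")
            ```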
            • 3. Re: cachesize or cachememsize
              841083
              Agree with all that Chris mentioned.

              Personally, I wouldn't worry about using the OS filesystem cache method until you get over (say) 20M entries or so.
              Beyond that point the checkpointing pause can start to bite.

              If you have HUGE caches (like trying to cache 50M entries) and you ever have a core dump or try to debug an issue live, you are most definitely going to wish you had kept those caches a wee bit smaller :-)

              For your case (2M entries, wasn't it?) if you have the memory available, yes, you can cache everything, including in the entry cache.
              Get to 50M entries or so, and based upon my testing in the past, the entry cache becomes much too large, and really doesn't buy much (if anything).

              If you experiment with OS filesystem caching, don't reduce the DB cache too much; it starts to affect update performance.
              You need to run a few update benchmarks with different DB cache sizes (a sketch of one is below); on the systems I last tested this on, going below 100M for the DB cache started to reduce update perf.
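
              A crude benchmark is enough to see the trend. This sketch uses the ldap3 Python library purely for illustration; the host, credentials, and test DN are placeholders:

              ```python
              # Sketch: time N modify operations, then repeat with each candidate
              # DB cache size and compare the averages.
              import time
              from ldap3 import Server, Connection, MODIFY_REPLACE  # pip install ldap3

              N = 1000
              server = Server("ldap.example.com", port=389)                # placeholder
              conn = Connection(server, "cn=Directory Manager", "secret",  # placeholder
                                auto_bind=True)

              dn = "uid=user.0,ou=people,dc=example,dc=com"                # placeholder
              start = time.monotonic()
              for i in range(N):
                  conn.modify(dn, {"description": [(MODIFY_REPLACE, [f"bench-{i}"])]})
              elapsed = time.monotonic() - start
              print(f"{N} modifies in {elapsed:.1f}s, avg {1000 * elapsed / N:.2f} ms/op")
              ```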

              But, for 2M entries, you shouldn't need to worry about this.

              If you have limited memory, start by caching as much of the DB+Indexes as you can.

              If you can cache it all and have memory left, allocate it to the entry cache. Don't assume that the size you specify is the maximum size the entry cache will reach. 7.x is much better at respecting your config than earlier versions were, but other things use the heap and you can use more memory than you expect, so leave a little headroom. The absolute last thing you want to do is run out of memory ... DSEE will try to do a clean shutdown, but if it really is out of memory you will probably be looking at a corrupt DB.

              Maybe my definition of trivial needs some work... I have probably spent too much time working with people trying to cram hundreds of millions of entries into LDAP. You can do it, and get good results, but it's nowhere near as easy as handling 2M or even 10M entries.
              • 4. Re: cachesize or cachememsize
                stan25
                Do you have any document reference on how to use the OS file system cache for Oracle Directory Server? I know it is a bit of a generic question and I should test, but I would like to know how to configure the OS file system cache for the Directory first. If I go to my Unix engineers, their answer will be 'get the document from Oracle'. If you have any examples or references about OS file system cache settings, please pass them on.

                Thanks
                • 5. Re: cachesize or cachememsize
                  869208
                  I am coming to the same point: for a db of 2.5 million entries, the ratio of cache hits to cache tries is good. The users are random, and the applications using the LDAP server are also random, since we cater to different sets of users and everyone logs in, so the repetition of users authenticating or being searched is not the same. Even then, the ratio looks good.
                  I am sure the working set of entries is neither constant nor small.

                  I am not sure about the db checkpoint interval, as I thought it related more to write commit time. I am not sure how it would impact search etimes.

                  Filesystem cache is fine, but the cache hit ratio would not take the filesystem cache into account. Still, the filesystem cache with ZFS would improve performance. Also, with version 7, entries can be compressed for better caching, though uncompressing the entries will drive CPU utilization higher.

                  The cache hit ratio would be useless only if we don't consider other factors, like repetitive searches seen in the logs, but in my case it is different users accessing the db all the time.

                  To have all 20M entries in cache, we would need something like 100G of RAM for the entry cache to hold all the entries in memory, and more once the data structures stored in the entry cache are included. And the users who log in are not the same all the time.
                  • 6. Re: cachesize or cachememsize
                    802907
                    869208 wrote:
                    I am not sure about the db checkpoint interval, as I thought it related more to write commit time. I am not sure how it would impact search etimes.
                    No, it does not relate more to writes. It affects read operations on data in any dirty page in the DB cache. It actually shows up measurably against read operations in practice, because read responses are generally so much faster that the effect of the checkpoint is significant.
                    Filesystem cache is fine, but the cache hit ratio would not take the filesystem cache into account. Still, the filesystem cache with ZFS would improve performance. Also, with version 7, entries can be compressed for better caching, though uncompressing the entries will drive CPU utilization higher.
                    Certainly, when you use filesystem cache, the DS cache management is not as critical. Entry compression has been shown to have a relatively minor cost in CPU and an extremely beneficial effect on I/O. I'd suggest running some performance tests yourself to see what benefit compression offers your deployment.
                    The cache hit ratio would be useless only if we don't consider other factors, like repetitive searches seen in the logs, but in my case it is different users accessing the db all the time.
                    The server-computed cache hit ratio is useless because it generally tells you next to nothing about what is happening in real time. Once your server has been up for a month or two, the number of cachetries is so huge that your cache performance could completely bottom out for minutes at a time with no visible effect on the hit ratio. All it will tell you is the average hit ratio over the entire uptime of the server. I don't know what you call that, but I call it useless, since I am not all that interested in such a non-specific statistic.
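
                    To put numbers on that (all figures invented, purely to show the dilution):

                    ```python
                    # A month of 99% hits at 1,000 cache tries/sec, followed by
                    # 5 minutes of total cache misses. The aggregate barely moves.
                    month_tries = 1000 * 86400 * 30          # ~2.6 billion tries
                    month_hits  = int(month_tries * 0.99)
                    bad_tries   = 1000 * 300                 # 5 minutes of 0% hits
                    bad_hits    = 0

                    aggregate = 100.0 * (month_hits + bad_hits) / (month_tries + bad_tries)
                    print(f"{aggregate:.4f}%")               # ~98.99%: the collapse is invisible
                    ```
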
                    Using all 20M entries to be in cache, we need to have a RAM size of 100G to the entry cache to have all the entries in memory and more including the datastructure that will be stored in the entry cache. As the users who login are not the same all the time.
                    That seems pretty high. Have you verified this empirically?
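
                    A quick back-of-the-envelope (assuming a uniform per-entry footprint, which is a big assumption) shows what that figure implies:

                    ```python
                    # What 100 GB of entry cache for 20M entries implies per entry.
                    entries = 20_000_000
                    ram     = 100 * 2**30                   # 100 GiB
                    print(f"{ram / entries:.0f} bytes")     # ~5369 bytes per cached entry
                    ```

                    Roughly 5 KB per cached entry is plausible for some schemas, but it is worth measuring on your actual data before committing to that much RAM.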