6 Replies Latest reply: Oct 16, 2012 6:36 PM by user738616

    Size of an index

    965779
      I suspect the size of an index of a NamedCache depends on the following:
      1) Number of entries in the cache
      2) The number of unique indexed attributes
      3) Size of each indexed attribute

      Is there a formula to arrive at the total size of an index?
        • 1. Re: Size of an index
          user738616
          Hi,

          The index size should be approximately ( (100 * M) + (56 * N) + (M * A) ) * I bytes

          where,
          M = Distinct entries (distinct values of the indexed attribute)
          N = Number of objects
          A = Average attribute size (bytes)
          I = Number of indexed attributes

          100 * M covers the forward map entries
          56 * N covers the forward and reverse map entries, split equally between the two
          M * A covers the attribute values stored in the reverse map

          I would definitely add 10% to the result of the formula above, as the hash tables may not be 100% full.
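
As a rough illustration of the formula above, here is a minimal Python sketch. The function name, parameter names, and sample workload are hypothetical and chosen only for this example; the constants 100 and 56 and the 10% headroom are taken from the post as-is and have not been verified against any particular Coherence version or JVM.

```python
def estimate_index_bytes(distinct_values, num_objects, avg_attr_bytes,
                         indexed_attrs, headroom_percent=10):
    """Rule-of-thumb index size in bytes, per the formula in this post.

    distinct_values -> M, num_objects -> N,
    avg_attr_bytes  -> A, indexed_attrs -> I
    """
    per_index = (
        100 * distinct_values                 # 100 * M: forward map entries
        + 56 * num_objects                    # 56 * N: forward + reverse map entries
        + distinct_values * avg_attr_bytes    # M * A: stored attribute values
    )
    base = per_index * indexed_attrs
    # Headroom for hash tables that are not completely full (integer math).
    return base + base * headroom_percent // 100


if __name__ == "__main__":
    # Hypothetical workload: 1,000,000 objects, 10,000 distinct values,
    # 20-byte attribute values on average, 2 indexed attributes.
    print(estimate_index_bytes(10_000, 1_000_000, 20, 2))
```

Treat the result as a starting point for provisioning only; it is an estimate, not a measurement.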

          HTH

          Cheers,
          _NJ
          • 2. Re: Size of an index
            robvarga
            user738616 wrote:
            Hi,

            The index size should be approximately ( (100 * M) + (56 * N) + (M * A) ) * I bytes

            where,
            M = Distinct entries (distinct values of the indexed attribute)
            N = Number of objects
            A = Average attribute size (bytes)
            I = Number of indexed attributes

            100 * M covers the forward map entries
            56 * N covers the forward and reverse map entries, split equally between the two
            M * A covers the attribute values stored in the reverse map

            I would definitely add 10% to the result of the formula above, as the hash tables may not be 100% full.

            HTH

            Cheers,
            _NJ
            I think this overly simplifies the situation.

            You should always measure your index sizes with real data on the real architecture, because index size also depends on whether you use a 32-bit or a 64-bit JVM, and on whether the 64-bit JVM uses compressed object references (which, in turn, impose a limit on the maximum heap size).

            Also, starting with 3.6 you can plug in your own index implementations, and you can even configure or override the out-of-the-box indexes in ways that affect their memory usage. For example, an ordered index consumes more memory, and you can influence index storage quite significantly by using ConditionalIndex instead of SimpleMapIndex.

            Bottom line: measure with real data and real configuration in a production-like environment.

            Best regards,

            Robert
            • 3. Re: Size of an index
              965779
              How does one arrive at the numbers 100*M and 56*N?

              Thanks for your help
              • 4. Re: Size of an index
                user738616
                Hi,
                962776 wrote:
                How does one arrive at the numbers 100*M and 56*N?

                Thanks for your help
                Here is a detailed breakdown:

                100*M = Forward Map {4M for hash table references + 24M for HashMap$Entry + 16M for SafeHashSet + 56M for the SafeHashMap used by the SafeHashSet}
                56*N = Forward Map {4N for hash set references + 24N for HashMap$Entry} + Reverse Map {4N for references + 24N for HashMap$Entry}
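
To check that the components listed above add up to the two constants, here is a small Python sketch. The dictionary names and keys are labels invented for this example; the byte counts are copied from this post and are not independently verified.

```python
# Per-distinct-value costs (the 100 * M term), in bytes, as listed in the post.
PER_DISTINCT_VALUE = {
    "hash table reference": 4,
    "HashMap$Entry": 24,
    "SafeHashSet": 16,
    "SafeHashMap": 56,
}

# Per-object costs (the 56 * N term), in bytes, as listed in the post.
PER_OBJECT = {
    "forward map: hash set reference": 4,
    "forward map: HashMap$Entry": 24,
    "reverse map: reference": 4,
    "reverse map: HashMap$Entry": 24,
}

per_distinct_bytes = sum(PER_DISTINCT_VALUE.values())
per_object_bytes = sum(PER_OBJECT.values())

if __name__ == "__main__":
    print(per_distinct_bytes, per_object_bytes)
```

The sums come to 100 and 56, matching the coefficients of M and N in the formula.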

                HTH

                Cheers,
                _NJ
                • 5. Re: Size of an index
                  user738616
                  robvarga wrote:
                  I think this overly simplifies the situation.

                  You should always measure your index sizes with real data on the real architecture, because index size also depends on whether you use a 32-bit or a 64-bit JVM, and on whether the 64-bit JVM uses compressed object references (which, in turn, impose a limit on the maximum heap size).

                  Also, starting with 3.6 you can plug in your own index implementations, and you can even configure or override the out-of-the-box indexes in ways that affect their memory usage. For example, an ordered index consumes more memory, and you can influence index storage quite significantly by using ConditionalIndex instead of SimpleMapIndex.

                  Bottom line: measure with real data and real configuration in a production-like environment.

                  Best regards,

                  Robert
                  Robert,

                  I agree with you, but using the calculation does allow you to provision for your memory requirements.

                  Cheers,
                  _NJ
                  • 6. Re: Size of an index
                    robvarga
                    user738616 wrote:
                    Robert,

                    I agree with you, but using the calculation does allow you to provision for your memory requirements.

                    Cheers,
                    _NJ
                    That may hold on a 32-bit JVM with non-sorted indexes, and I haven't actually verified the numbers. As I said, it may be true for a certain case, but it is certainly not universally true.
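
To illustrate why the JVM's reference width matters, here is a hedged Python sketch that recomputes only the reference slots from the breakdown earlier in the thread (4M + 4N + 4N bytes at 4 bytes per reference). It assumes 4-byte references on a 32-bit JVM or with compressed object references, and 8-byte references on a 64-bit JVM without them. The function names are hypothetical. Object headers and entry objects also grow with wider references, so the difference computed here is only a lower bound on the extra memory per index.

```python
def reference_slot_bytes(distinct_values, num_objects, ref_bytes):
    """Bytes spent on plain reference slots for one index.

    The breakdown counts one reference per distinct value (forward map
    hash table) and two per object (forward map hash set plus reverse map).
    """
    return ref_bytes * (distinct_values + 2 * num_objects)


def extra_bytes_with_wide_refs(distinct_values, num_objects):
    """Lower bound on extra bytes when references are 8 bytes instead of 4."""
    narrow = reference_slot_bytes(distinct_values, num_objects, 4)
    wide = reference_slot_bytes(distinct_values, num_objects, 8)
    return wide - narrow


if __name__ == "__main__":
    # Hypothetical workload: 10,000 distinct values, 1,000,000 objects.
    print(extra_bytes_with_wide_refs(10_000, 1_000_000))
```

Numbers like these only show the direction of the effect; measuring on the actual JVM and configuration remains the reliable approach.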

                    Best regards,

                    Robert