1 2 Previous Next 21 Replies Latest reply on Feb 19, 2007 11:46 PM by 796440

    Ultra fast performance using primitives

    807606
      Hello,

      I need a way to store/get values in a hashtable of primitives.
      The key is an int, and the value is two shorts.

      Since this code will be called zillion of times and because the nature of the application is that it needs to perform in near real-time I need to find a way to do this as fast as possible. It is not only the get from the hashtable that should be fast, almost more importantly is how fast it is created.

      I exclude the standard Java Collections Framework because it does not allow me to work with primitives but forces me to work with objects. Object creation is something to be avoided for this type of application, as far as I understand.

      I'm looking at alternatives such as Primitive Collections for Java (http://pcj.sourceforge.net/) and GNU Trove (http://trove4j.sourceforge.net/).
      For example GNU Trove has an object that allows me to map integers to integers. (
      TIntIntHashMap
      ). This is pretty much what I'm looking for besides the fact that my value(s) is two shorts, not a single int.

      I can of course find myself an collection that allows the key to be a primitive integer while the value is an object. This was the value could be an array of the two shorts. However arrays are objects too in the Java world and so they have the overhead of object creation with them even if it is an array of primitives.

      To me the best solution would be to store the two shorts in the value part of an Int/Int collection. This should be possible I guess, but I do not know how to stuff two shorts into an int. Any bit manipulation experts know how to do this? I would need to know both how to pack them and how to unpack them.

      Any other ideas are also very welcome. Performance is what I'm looking for.

      Basically to make it work as fast as possible I understand I should follow these guidelines:

      * Use primitive datatypes
      * Avoid object creation
      * Pre-allocate memory

      Let me know if you can help on the bit manipulation issue mentioned above but I would also very much like to hear your thoughts on how to get optimal performance in a case like this.

      Thanks.
        • 1. Re: Ultra fast performance using primitives
          807606
          Minor disturbing typo error: "This was the value ..." should have been "This way the value...". Apologies.
          • 2. Re: Ultra fast performance using primitives
            807606
            >
            I exclude the standard Java Collections Framework
            because it does not allow me to work with primitives
            but forces me to work with objects. Object creation
            is something to be avoided for this type of
            application, as far as I understand.
            I would code it using the standard collections first. Then test it, if it still is not fast enough then search for ways to improve the performance. But I think you will be surprised at how fast object creation is in Java.
            • 3. Re: Ultra fast performance using primitives
              807606
              Since this code will be called zillion of times and
              How many times, exactly, is a "zillion"?

              And more important, what's the rate -- as in "3 zillion per second, where a zillion is actually 100,000"?

              Those are serious questions. Before even thinking of tuning performance, know your performance boundaries. Your exact performance boundaries.
              Object creation is something to be avoided for this type of
              application, as far as I understand.
              Perhaps. But how do you think a hashtable works?

              Any other ideas are also very welcome. Performance is
              what I'm looking for.
              How big are the key values? Simplest solution is just to use two arrays of shorts, and use the key as an index into the array. Limited primarily by the amount of main memory you have.

              Or implement your own hashtable-like data structure, using entries that contain both values.

              Basically to make it work as fast as possible I
              understand I should follow these guidelines:
              No. To make it work as fast as possible, you first make it work, using whatever techniques are easiest, then you profile, then you think about how to improve the pieces of code that are slow.
              • 4. Re: Ultra fast performance using primitives
                807606
                Since this code will be called zillion of times and
                How many times, exactly, is a "zillion"?
                Reminds me of the joke:

                Staffer: Mr President, today two Brazilian soldiers were killed in Iraq.
                Bush: Oh my God, this is terrible, this is awful. Wait... how much is a bazillion?
                • 5. Re: Ultra fast performance using primitives
                  JosAH
                  This is pretty much what I'm looking for besides the fact that my
                  value(s) is two shorts, not a single int.
                  Oh c'mon, that's nothing a little hacking couldn't solve; hint: an int
                  has 4 bytes; a short takes only two; wink wink nudge nudge.

                  kind regards,

                  Jos
                  • 6. Re: Ultra fast performance using primitives
                    800282
                    ...
                    Oh c'mon, that's nothing a little hacking couldn't solve; hint: an int
                    has 4 bytes; a short takes only two; wink wink nudge nudge.
                    Are you flirting with the OP again, Jos?
                    • 7. Re: Ultra fast performance using primitives
                      807606
                      I'm amazed. You guys are fast. Thanks.

                      As for performance: The part of the code that creates the hashtable will be called up to 500.000 times per second. The average table size will be between 10 and 40 elements.

                      I can understand why you say I must first try using stand techniques and standard Java Collections Framework and see how fast that really is, but I know that others before me (that have had a crack at this problem) have had the problem of not being able to tweak enough performance out of that particular part of the code. So I thought I would take another approach from the beginning.

                      Really appreciate your help.
                      • 8. Re: Ultra fast performance using primitives
                        807606
                        As for performance: The part of the code that
                        creates the hashtable will be called up to
                        500.000 times per second. The average table size will
                        be between 10 and 40 elements.
                        That's a lot. It's easy to create a fast hash-table implementation around a simple primitive array, and a reasonably fast processor could create and fill a that table within your 2 microsecond time bound (I tried a quick benchmark), but that doesn't leave much if any time for anything else to happen. How are you planning to get the data for this table? How are you planning to store it? How are you planning to process it?

                        I think you need to step back and do some overall application analysis. If you're really getting 500k samples per second, building a hashtable of those samples will be the least of your worries.
                        • 9. Re: Ultra fast performance using primitives
                          807606
                          I do not know how to stuff two shorts into an int.
                          Shift and or. To extract them, shift and and. http://java.sun.com/docs/books/tutorial/java/nutsandbolts/op3.html
                          • 10. Re: Ultra fast performance using primitives
                            796440
                            As for performance: The part of the code that
                            creates the hashtable will be called up to
                            500.000 times per second. The average table size will
                            be between 10 and 40 elements.
                            40 elements is pretty small. You have an int for a key and two shorts for a value? Why not just use a sorted array of longs. The high order 32 bits of each element are the int key, and the low order 32 bits are the 2-short value.

                            Or two arrays of ints. The first is sorted (for a binary search) and then once you have the index, use that to index into the second array of ints holding the values (or 2 arrays of shorts).

                            I don't know which is fastest: comparing and bit masking/shifting a single a array of longs vs. 2 arrays of ints vs. an array of ints and 2 arrays of shorts
                            • 11. Re: Ultra fast performance using primitives
                              796440
                              As for performance: The part of the code that
                              creates the hashtable will be called up to
                              500.000 times per second. The average table size
                              will
                              be between 10 and 40 elements.
                              40 elements is pretty small.
                              So instead of creating, say, 100 arrays of 40, you could create one array of 4000, and use different chunks of it for different iterations. Or just create the one array and reuse it, if you're only using one at a time.
                              • 12. Re: Ultra fast performance using primitives
                                796440
                                I'm assuming the int keys can be arbitrary int values? Or at least in a range that's too wide to make it practical to have the int key value translate directly to the index into an array?
                                • 13. Re: Ultra fast performance using primitives
                                  807606
                                  I'm assuming the int keys can be arbitrary int
                                  values? Or at least in a range that's too wide to
                                  make it practical to have the int key value translate
                                  directly to the index into an array?
                                  Good point. I know that my int keys fall in the range -32k --> +32K . So given that performance is extreme in this case I might pre-allocate an array of 64k ints and use the array index as the key. This will be a gigantic array where only a few of the elements are actually used. It will be fast, no doubt, but I would also need a very fast method of clearing the values in the array between each iteration. Hmm, don't think it will fly but thanks for input.
                                  • 14. Re: Ultra fast performance using primitives
                                    807606
                                    Just wanted to partly answer my own question of how to pack 2 shorts into an int and unpack them again. Thanks for pointers.

                                    Packing:
                                    short s1 = 525;
                                    short s2 = -6345;
                                    
                                    int x =  (s2 << 16) |  s1;
                                    Unpacking:
                                            
                                    int s1_unpacked = x & 0x0000ffff;
                                    int s2_unpacked  = (x & 0xffff0000) >> 16;
                                    1 2 Previous Next