1 2 3 Previous Next 42 Replies Latest reply: Dec 20, 2004 3:49 PM by 800323 RSS

    HashSort?

    807596
      Okay, I was just writing a method that sorts an array of ints with very, very, quick runtime- much better than Quicksort and even Radix.

      public static int[] sort(int[] a)
      {
           int[] sorted = new int[getMax(a)+1];
           for(int i = 0; i < a.length; i++)
                sorted[a[i]] = a; //assigns the value to the array[value] somewhat like hashing
           int free = 0;
           /*
           *Note: This for loop is optional, it just eliminates empty spaces in O(n)
           */
           for(int k = 0; k < sorted.length; k++)
           {
                if(sorted[k]==0)
                     free = k;
                else
                {
                sorted[free] = sorted[k];
                sorted[k] = 0;
                }
           }
           return sorted;
      }

      This makes this algorithm O(n+N), where the least efficient aspect is eliminating the empty spaces(zeros). As of now, I am not concerned about duplicates (dont want to get into buckets) and the numbers must be positive as well.
      So, I had two questions:
      1.)
      Is there any way to improve the efficiency of this sort? (runtime, not O notation)
      - a thought- would using an ArrayList make it more efficient? (eliminates one loop)
      the downside would be that I would have to store the values as Integers.
      Also, would using ArrayList.trimToSize() be a good idea here?

      2.)
      And, is there any easy way to apply this to objects?
      -here is my plan (very very inefficient though)
      use obj.toString() to gen a distinct string value for each object
      convert this string to a char array (String.toCharArray())
      and than convert each character to a number to use as a key
      but this makes the algorithm n^2, ie totally useless in my case

      Sorry for the intolerable length of this post.
      Thanks for you efforts...
        • 1. Re: HashSort?
          807596
          1.)
          Is there any way to improve the efficiency of this
          sort? (runtime, not O notation)
          - a thought- would using an ArrayList make it more
          efficient? (eliminates one loop)
          the downside would be that I would have to store the
          values as Integers.
          Try it and find out.
          Also, would using ArrayList.trimToSize() be a good
          idea here?
          Try it and find out.

          >
          2.)
          And, is there any easy way to apply this to objects?
          -here is my plan (very very inefficient though)
          use obj.toString() to gen a distinct string value for
          each object
          convert this string to a char array
          (String.toCharArray())
          and than convert each character to a number to use as
          a key
          Why not just call hashCode() on each object?
          • 2. Re: HashSort?
            807596
            yes, i thought about using hashCode() before.
            There is a problem however:
            i thought the hashCode() was just a random number that is unique to the object.
            If the hashCode() is random, it is no use for ordering the object.
            At least that was my understanding:
            String s1 = "abc";
            String s2 = "xyz";
            s1.hashCode() is not necessarily less than s2.hashCode(), correct?
            • 3. Re: HashSort?
              807596
              i thought the hashCode() was just a random number that is unique to the object.
              no, it's not random. and no, it's not unique...
              • 4. Re: HashSort?
                807596
                If the hashCode() is random, it is no use for ordering the object.
                At least that was my understanding:
                String s1 = "abc";
                String s2 = "xyz";
                s1.hashCode() is not necessarily less than s2.hashCode(), correct?
                why s1.hashCode() should be less than s2.hashCode()? did you mean the only one sort exists in world is ascending sort (if 'a' < 'z').

                there is a lot way to sort object. you can assume "abc" < "xyz" in String object. but what about other Object. eg.
                public class CoordinatPoint {
                   int x, y;
                }
                does (5, 4) > (4, 5)?
                • 5. Re: HashSort?
                  807596
                  Picture a String as a (sometimes) giant base 128 (depends on the charset) number. The best you can get is a radix sort, which means Strings can be sorted in Order(n * lgn( n )), where lgn = log base 128.

                  The sort you show is an O(3n + k) sort. I worked on one about 9 months back, and got it to run for negative numbers as well as positive, but never for lists where the max - min >= Integer.MAX_VALUE.

                  Anyway, you can make you version faster by thinking about this:
                  What's the runtime to sort the list "new int[]{0}" versus "new int[]{Integer.MAX_VALUE - 1}"?
                  • 6. Re: HashSort?
                    807596
                    And to respond to your other post
                    can be applied to other Objects as well
                    (or something like that)

                    Yes. Any Object which can be represented by an int in the range [0, Integer.MAX_VALUE).
                    • 7. Re: HashSort?
                      807596
                      To quickly compare objects, you could cast your object to a Comparable object, and just use the Comparable.compareTo(Object o) method, i think this works for all library classes, but you will have to implement Comparable on any of the classes you make, I used this on my most recent tree(got it from a friend tho) and it works pretty fast, way faster than the toString() method, trust me.
                      • 8. Re: HashSort?
                        807596
                        To quickly compare objects, you could cast your
                        object to a Comparable object, and just use the
                        Comparable.compareTo(Object o) method, i think this
                        works for all library classes, but you will have to
                        implement Comparable on any of the classes you make...
                        I would posit to say he already knew this, but good advice nonetheless...
                        • 9. Re: HashSort?
                          807596
                          This makes this algorithm O(n+N),
                          If you can perform a general sort in O(N) you've made a teoretical breakthrough you should publish. But this isn't a general sort because you utilize a special value structure. You require that the values are integers in a limited range only.

                          It's some kind of "set" sort. Basically you add the values to a set. Then you loop through the values from lower to higher and check whether each value is in the set. Unders circumstances this can be fast but also very slow. Say you have two values 0 and 7632547. You'll have to create an array of size 7632548 and then loop through all values between 0 and 7632547 to see if they're in the array (the set).

                          1) Is there any way to improve the efficiency of this sort?

                          No because it already fully utilizes a special structure of the values. You could use a more memory efficient implementation of a set though (a bit array)

                          2) And, is there any easy way to apply this to objects?

                          You can generalize it to objects but only if you keep the special structure. To do that the objects must be numbered and the numbers must reflect the relative order among objects.

                          So this sorting algoritm is limited to a special value structure. The runtime characteristics can be awful (few values in a large range). To generalize you must first number to objects so that each number reflects the objects place in the sequence. Yhis basically involves a general sort of the objects before you apply your sorting algoritm -:)
                          • 10. Re: HashSort?
                            807596
                            It's some kind of "set" sort. Basically you add the
                            values to a set. Then you loop through the values
                            from lower to higher and check whether each value is
                            in the set. Unders circumstances this can be fast but
                            also very slow. Say you have two values 0 and
                            7632547. You'll have to create an array of size
                            7632548 and then loop through all values between 0
                            and 7632547 to see if they're in the array (the
                            set).
                            So for such a small array, K is huge. But if there were enough values, K would be inconsequential. The basic N-ness of this sort is realized because of the limited range of an integer. If it were to sort BigIntegers, N-ness would be impossible.

                            1) Is there any way to improve the efficiency of this
                            sort?

                            No because it already fully utilizes a special
                            structure of the values. You could use a more memory
                            efficient implementation of a set though (a bit
                            array)
                            There is a way. The order he runs at now is O(3n + max), where it could be O(3n + (max - min)).

                            2) And, is there any easy way to apply this to
                            objects?

                            You can generalize it to objects but only if you keep
                            the special structure. To do that the objects must be
                            numbered and the numbers must reflect the relative
                            order among objects.
                            And the numbers have to be in the proper range.

                            So this sorting algoritm is limited to a special
                            value structure. The runtime characteristics can be
                            awful (few values in a large range). To generalize
                            you must first number to objects so that each number
                            reflects the objects place in the sequence. This
                            basically involves a general sort of the objects
                            before you apply your sorting algoritm -:)
                            Like a counting sort, which is quadratic...
                            • 11. Re: HashSort?
                              807596
                              There is a way. The order he runs at now is O(3n +
                              max), where it could be O(3n + (max - min)).
                              Well, the OP states he has an O(n+N) algoritms, and to me it looks like one, if N is the number of values and n is the value range.

                              First the n values must be inserted into a set using n operations. Then all values from min to max must be checked to see if they're included in the set. That's N = max-min+1 operations.
                              • 12. Re: HashSort?
                                807596
                                This makes this algorithm O(n+N), where the least
                                efficient aspect is eliminating the empty
                                spaces(zeros).
                                The cost of your algorithm is ties to the magnitude of the values. So even if it is technically O(n) (which I'm not so sure about) if you don't eliminate the spaces, whatever uses the array later will have to iterate over them. You are just passing the cost of the algorithm to something else.

                                It's n squared because you have to run the for loop to eliminate spaces for each number in the sort.

                                For example. I give your method three numbers - 1000, 100,000 and 100,000,000. That would take your method a lot longer than any of the normal sorts.

                                I think in quantum computing this is a common approach because you can search all elements of an array at once in a quantum computer.
                                • 13. Re: HashSort?
                                  807596
                                  *Note: This for loop is optional, it just eliminates
                                  s empty spaces in O(n)
                                       */
                                       for(int k = 0; k < sorted.length; k++)
                                       {
                                            if(sorted[k]==0)
                                                 free = k;
                                            else
                                            {
                                            sorted[free] = sorted[k];
                                            sorted[k] = 0;
                                            }
                                       }
                                       return sorted;
                                  }
                                  That doesn't work.

                                  input=1,5, 10

                                  [01][00][00][00][05][00][00][00][00][10]

                                  When it reaches position 4, moves '05' in to position 1. The next iteration finds an empty space and sets free to 5. When it reaches position 9. It moves '10' to position 5. It stops. Also, note that if 6 was in the input, value '05' would be overwritten and lost.
                                  • 14. Re: HashSort?
                                    807596
                                    For example. I give your method three numbers - 1000, 100,000 and 100,000,000. That would take your method a lot longer than any of the normal sorts. <
                                    Yeah, but that's for a list of three elements. If you had that list 5000000 times over, just those three input values, but a much longer list, the disparity in the range would have less of an impact, and this algorithm would smoke every other sort in existance.
                                    1 2 3 Previous Next