1 2 3 Previous Next 42 Replies Latest reply on Jan 7, 2005 2:48 AM by 807596

# HashSort?

Okay, I was just writing a method that sorts an array of ints with very, very, quick runtime- much better than Quicksort and even Radix.

public static int[] sort(int[] a)
{
int[] sorted = new int[getMax(a)+1];
for(int i = 0; i < a.length; i++)
sorted[a[i]] = a; //assigns the value to the array[value] somewhat like hashing
int free = 0;
/*
*Note: This for loop is optional, it just eliminates empty spaces in O(n)
*/
for(int k = 0; k < sorted.length; k++)
{
if(sorted[k]==0)
free = k;
else
{
sorted[free] = sorted[k];
sorted[k] = 0;
}
}
return sorted;
}

This makes this algorithm O(n+N), where the least efficient aspect is eliminating the empty spaces(zeros). As of now, I am not concerned about duplicates (dont want to get into buckets) and the numbers must be positive as well.
1.)
Is there any way to improve the efficiency of this sort? (runtime, not O notation)
- a thought- would using an ArrayList make it more efficient? (eliminates one loop)
the downside would be that I would have to store the values as Integers.
Also, would using ArrayList.trimToSize() be a good idea here?

2.)
And, is there any easy way to apply this to objects?
-here is my plan (very very inefficient though)
use obj.toString() to gen a distinct string value for each object
convert this string to a char array (String.toCharArray())
and than convert each character to a number to use as a key
but this makes the algorithm n^2, ie totally useless in my case

Sorry for the intolerable length of this post.
Thanks for you efforts...
• ###### 1. Re: HashSort?
1.)
Is there any way to improve the efficiency of this
sort? (runtime, not O notation)
- a thought- would using an ArrayList make it more
efficient? (eliminates one loop)
the downside would be that I would have to store the
values as Integers.
Try it and find out.
Also, would using ArrayList.trimToSize() be a good
idea here?
Try it and find out.

>
2.)
And, is there any easy way to apply this to objects?
-here is my plan (very very inefficient though)
use obj.toString() to gen a distinct string value for
each object
convert this string to a char array
(String.toCharArray())
and than convert each character to a number to use as
a key
Why not just call hashCode() on each object?
• ###### 2. Re: HashSort?
yes, i thought about using hashCode() before.
There is a problem however:
i thought the hashCode() was just a random number that is unique to the object.
If the hashCode() is random, it is no use for ordering the object.
At least that was my understanding:
String s1 = "abc";
String s2 = "xyz";
s1.hashCode() is not necessarily less than s2.hashCode(), correct?
• ###### 3. Re: HashSort?
i thought the hashCode() was just a random number that is unique to the object.
no, it's not random. and no, it's not unique...
• ###### 4. Re: HashSort?
If the hashCode() is random, it is no use for ordering the object.
At least that was my understanding:
String s1 = "abc";
String s2 = "xyz";
s1.hashCode() is not necessarily less than s2.hashCode(), correct?
why s1.hashCode() should be less than s2.hashCode()? did you mean the only one sort exists in world is ascending sort (if 'a' < 'z').

there is a lot way to sort object. you can assume "abc" < "xyz" in String object. but what about other Object. eg.
``````public class CoordinatPoint {
int x, y;
}``````
does (5, 4) > (4, 5)?
• ###### 5. Re: HashSort?
Picture a String as a (sometimes) giant base 128 (depends on the charset) number. The best you can get is a radix sort, which means Strings can be sorted in Order(n * lgn( n )), where lgn = log base 128.

The sort you show is an O(3n + k) sort. I worked on one about 9 months back, and got it to run for negative numbers as well as positive, but never for lists where the max - min >= Integer.MAX_VALUE.

What's the runtime to sort the list "new int[]{0}" versus "new int[]{Integer.MAX_VALUE - 1}"?
• ###### 6. Re: HashSort?
And to respond to your other post
can be applied to other Objects as well
(or something like that)

Yes. Any Object which can be represented by an int in the range [0, Integer.MAX_VALUE).
• ###### 7. Re: HashSort?
To quickly compare objects, you could cast your object to a Comparable object, and just use the Comparable.compareTo(Object o) method, i think this works for all library classes, but you will have to implement Comparable on any of the classes you make, I used this on my most recent tree(got it from a friend tho) and it works pretty fast, way faster than the toString() method, trust me.
• ###### 8. Re: HashSort?
To quickly compare objects, you could cast your
object to a Comparable object, and just use the
Comparable.compareTo(Object o) method, i think this
works for all library classes, but you will have to
implement Comparable on any of the classes you make...
I would posit to say he already knew this, but good advice nonetheless...
• ###### 9. Re: HashSort?
This makes this algorithm O(n+N),
If you can perform a general sort in O(N) you've made a teoretical breakthrough you should publish. But this isn't a general sort because you utilize a special value structure. You require that the values are integers in a limited range only.

It's some kind of "set" sort. Basically you add the values to a set. Then you loop through the values from lower to higher and check whether each value is in the set. Unders circumstances this can be fast but also very slow. Say you have two values 0 and 7632547. You'll have to create an array of size 7632548 and then loop through all values between 0 and 7632547 to see if they're in the array (the set).

1) Is there any way to improve the efficiency of this sort?

No because it already fully utilizes a special structure of the values. You could use a more memory efficient implementation of a set though (a bit array)

2) And, is there any easy way to apply this to objects?

You can generalize it to objects but only if you keep the special structure. To do that the objects must be numbered and the numbers must reflect the relative order among objects.

So this sorting algoritm is limited to a special value structure. The runtime characteristics can be awful (few values in a large range). To generalize you must first number to objects so that each number reflects the objects place in the sequence. Yhis basically involves a general sort of the objects before you apply your sorting algoritm -:)
• ###### 10. Re: HashSort?
It's some kind of "set" sort. Basically you add the
values to a set. Then you loop through the values
from lower to higher and check whether each value is
in the set. Unders circumstances this can be fast but
also very slow. Say you have two values 0 and
7632547. You'll have to create an array of size
7632548 and then loop through all values between 0
and 7632547 to see if they're in the array (the
set).
So for such a small array, K is huge. But if there were enough values, K would be inconsequential. The basic N-ness of this sort is realized because of the limited range of an integer. If it were to sort BigIntegers, N-ness would be impossible.

1) Is there any way to improve the efficiency of this
sort?

No because it already fully utilizes a special
structure of the values. You could use a more memory
efficient implementation of a set though (a bit
array)
There is a way. The order he runs at now is O(3n + max), where it could be O(3n + (max - min)).

2) And, is there any easy way to apply this to
objects?

You can generalize it to objects but only if you keep
the special structure. To do that the objects must be
numbered and the numbers must reflect the relative
order among objects.
And the numbers have to be in the proper range.

So this sorting algoritm is limited to a special
value structure. The runtime characteristics can be
awful (few values in a large range). To generalize
you must first number to objects so that each number
reflects the objects place in the sequence. This
basically involves a general sort of the objects
before you apply your sorting algoritm -:)
Like a counting sort, which is quadratic...
• ###### 11. Re: HashSort?
There is a way. The order he runs at now is O(3n +
max), where it could be O(3n + (max - min)).
Well, the OP states he has an O(n+N) algoritms, and to me it looks like one, if N is the number of values and n is the value range.

First the n values must be inserted into a set using n operations. Then all values from min to max must be checked to see if they're included in the set. That's N = max-min+1 operations.
• ###### 12. Re: HashSort?
This makes this algorithm O(n+N), where the least
efficient aspect is eliminating the empty
spaces(zeros).
The cost of your algorithm is ties to the magnitude of the values. So even if it is technically O(n) (which I'm not so sure about) if you don't eliminate the spaces, whatever uses the array later will have to iterate over them. You are just passing the cost of the algorithm to something else.

It's n squared because you have to run the for loop to eliminate spaces for each number in the sort.

For example. I give your method three numbers - 1000, 100,000 and 100,000,000. That would take your method a lot longer than any of the normal sorts.

I think in quantum computing this is a common approach because you can search all elements of an array at once in a quantum computer.
• ###### 13. Re: HashSort?
*Note: This for loop is optional, it just eliminates
s empty spaces in O(n)
*/
for(int k = 0; k < sorted.length; k++)
{
if(sorted[k]==0)
free = k;
else
{
sorted[free] = sorted[k];
sorted[k] = 0;
}
}
return sorted;
}
That doesn't work.

input=1,5, 10

[01][00][00][00][05][00][00][00][00][10]

When it reaches position 4, moves '05' in to position 1. The next iteration finds an empty space and sets free to 5. When it reaches position 9. It moves '10' to position 5. It stops. Also, note that if 6 was in the input, value '05' would be overwritten and lost.
• ###### 14. Re: HashSort?
For example. I give your method three numbers - 1000, 100,000 and 100,000,000. That would take your method a lot longer than any of the normal sorts. <
Yeah, but that's for a list of three elements. If you had that list 5000000 times over, just those three input values, but a much longer list, the disparity in the range would have less of an impact, and this algorithm would smoke every other sort in existance.
1 2 3 Previous Next