Forum Stats

  • 3,751,300 Users
  • 2,250,339 Discussions
  • 7,867,378 Comments

Discussions

HashCode and equals problem

Prem
Prem Member Posts: 88
edited Sep 6, 2011 11:10AM in Java Programming
Hi I am implementing hashcode and equals method to avoid duplicates in Set collection. I successfully did it but confused ..

following is code snippet :


package TestCollection;

import java.util.Collections;
import java.util.Comparator;
import java.util.HashSet;
import java.util.SortedSet;
import java.util.TreeSet;

class Student
{

private int id;
private String fname;
public int getId() {
return id;
}
public void setId(int id) {
this.id = id;
}
public String getFname() {
return fname;
}
public void setFname(String fname) {
this.fname = fname;
}



@Override
public int hashCode() {

System.out.println("hashcode -->"+this.id+"--"+this.fname);
return this.id;
}

@Override
public boolean equals(Object o) {
System.out.println("equals -->"+((Student)o).getFname());
boolean bool = false;
if(this.id == ((Student)o).getId())
bool = true;
System.out.println("#############################"+this.getFname());
return bool;
}

@Override
public String toString() {
// TODO Auto-generated method stub
String s=""+this.id+" "+this.fname;
return s;
}
}

public class TestSetToAvoidDuplicateOfUserDefinedClass
{

public static void main(String[] args) {

HashSet tSet=new HashSet();
Student s1=new Student();

s1.setId(1);
s1.setFname("abc");

Student s2=new Student();
s2.setId(1);
s2.setFname("xyz");

Student s3=new Student();
s3.setId(2);
s3.setFname("abc");

Student s4=new Student();
s4.setId(3);
s4.setFname("lmn");


tSet.add(s1);
tSet.add(s2);
tSet.add(s3);
tSet.add(s4);





for(Object o:tSet)
{
System.out.println("-->"+o);
}




}


}

while adding equals method is being called only once when is s2 is about to add. For others it is not being called .. Could you please elaborate this kind of behavior...

Answers

  • 817788
    817788 Member Posts: 37
    edited Sep 5, 2011 11:07AM
    HashSet: It checks only the elements in one bucket and not all elements in the collection (Core Java 2, Vol I - Fundamentals, 8th Ed (Prentice Hall and Sun) 2008, page 670).
  • First things first: See how there is a striked out piece of text in Student´s hashCode() method? That happens if you do not use code tags to post code. Your code will only be properly (readably) formatted if you use them. To get help here, code should always be formatted, because the usual text formatting will consume certain characters like the asterisk or underscores. Unreadable code will in most cases not even be looked at, which costs you competent help.

    My (unprofessional, uneducated) guess is:

    You use setID(1) for both s1 and s2. Your hashCode() method returns id as the hashCode. Since hashing structures use the hashCode() to determine which bucket the element will be stored in, equals() is called to determine if the element is already present when you add elements with hashCode()s that "point" to the same bucket. HashSet, like all implementations of the interface Set, does not allow duplicate elements.

    http://download.oracle.com/javase/1.4.2/docs/api/java/util/HashSet.html
    http://download.oracle.com/javase/1.4.2/docs/api/java/util/Set.html
    http://en.wikipedia.org/wiki/Hash_function#Hash_tables (about the bucket thing)
  • tschodt
    tschodt Member Posts: 537
    Prem wrote:
    Hi I am implementing hashcode and equals method to avoid duplicates in Set collection. I successfully did it but confused ..

    following is code snippet :
    package TestCollection;
    
    import java.util.Collections;
    import java.util.Comparator;
    import java.util.HashSet;
    import java.util.SortedSet;
    import java.util.TreeSet;
    
    class Student
    {
    
    	private int id;
    	private String fname;
    	public int getId() {
    		return id;
    	}
    	public void setId(int id) {
    		this.id = id;
    	}
    	public String getFname() {
    		return fname;
    	}
    	public void setFname(String fname) {
    		this.fname = fname;
    	}
    	
    	
    	
    	@Override
    	public int hashCode() {
    		
    		System.out.println("hashcode -->"+this.id+"--"+this.fname);
    		return this.id;
    	}
    	
    	@Override
    	public boolean equals(Object o) {
    		System.out.println("equals -->"+((Student)o).getFname());
    		boolean bool = false;
    		if(this.id == ((Student)o).getId())
    			bool = true;
    		System.out.println("#############################"+this.getFname());
    		return bool;
    	}
    	
    	@Override
    	public String toString() {
    		// TODO Auto-generated method stub
    		String s=""+this.id+" "+this.fname;
    		return s;
    	}
    }
    
    public class TestSetToAvoidDuplicateOfUserDefinedClass
    {
    	
    	public static void main(String[] args) {
    		
    		HashSet tSet=new HashSet();
    		Student s1=new Student();
    		
    		s1.setId(1);
    		s1.setFname("abc");
    		
    		Student s2=new Student();
    		s2.setId(1);
    		s2.setFname("xyz");
    		
    		Student s3=new Student();
    		s3.setId(2);
    		s3.setFname("abc");
    		
    		Student s4=new Student();
    		s4.setId(3);
    		s4.setFname("lmn");
    		
    		
    		tSet.add(s1);
    		tSet.add(s2);		
    		tSet.add(s3);
    		tSet.add(s4);		
    		
    		
    				
    		
    
    		for(Object o:tSet)
    		{
    			System.out.println("-->"+o);
    		}
    		
    		
    		
    		
    	}
    	
    	
    }
    while adding equals method is being called only once when is s2 is about to add. For others it is not being called .. Could you please elaborate this kind of behavior...
    When hashcode values are different
    equals will always return false
    so the Set implementation does not need to invoke equals()...
  • 802316
    802316 Member Posts: 532
    edited Sep 5, 2011 11:21AM
    while adding equals method is being called only once when is s2 is about to add. For others it is not being called .. Could you please elaborate this kind of behavior...
    Hash collections place key or elements into bins based on their hashcode. When objects are in different bins, there is no need to compare them. When they fall into the same bin equals() is used for comparison. This is called a collision, something the Hashing function attempts to avoid. If you change the initial size of the HashSet to be large enough, you won't get any collision and equals will not be called at all.

    Notes:
    - when we say that HashSet or HashMap has an O(1) time complexity for add() it is assumed there is a perfect hashing function with no collisions. From experience a more realistic time complexity is O(log(log(N)) with some collisions.
    - Making the fields used in hashCode or equals mutable is best avoided. If you change one of these fields after you add it to a hash collection you can corrupt the collection and get something which looks like a memory leak. The same applies to the compareTo function for sorted collections. In this case using final fields and a constructor would make the code much shorter as well.
    - If you use a LinkedHashSet instead of HashSet (Similarly LinkedHashMap instead of HashMap) it can make debugging easier as the elements in the collection appear in a predictable order.
  • 796440
    796440 Member Posts: 19,179
    Peter Lawrey wrote:
    If you change the initial size of the HashSet to be large enough, you won't get any collision
    Not true. It decreases the likelihood of a collision, but whether a collision occurs or not still depends on the data being stored and the hashing function.
  • 836548
    836548 Member Posts: 286
    Prem wrote:
    while adding equals method is being called only once when is s2 is about to add. For others it is not being called .. Could you please elaborate this kind of behavior...
    the hashCode method is called whenever add is performed on HashSet, HashMap and HashTable.

    In your program, you are setting same id (i.e 1) for both s1 and s2. and in hashCode you have returned the Id, which is unique.
    If you set the same id for object s3 and s4, you will observe the equal method will also get called for s3 and s4 as well.
    I suggest to go through the below link to get more understanding of hashCode and equals methods:
    http://www.technofundo.com/tech/java/equalhash.html
  • darrylburke
    darrylburke Member Posts: 18,007
    Moderator advice: Please read the announcement(s) at the top of the forum listings and the FAQ linked from every page. They are there for a purpose.

    Then edit your post and format the code correctly.

    db
  • YoungWinston
    YoungWinston Member Posts: 4,310
    edited Sep 6, 2011 7:43AM
    Prem wrote:
    Hi I am implementing hashcode and equals method to avoid duplicates in Set collection.
    Further to all the above good advice: the only time that hashCode() is used to check for duplication is in a hashed Set or Map, and even then it's only used as a "first cut" (as has already been said, to determine the bucket that the item should be in); equals() is always the final arbiter as to whether something is "the same as" an existing item or not.

    This is the reason why equals() and hashCode() MUST be based on the same things.

    BTW, your equals() method could do with some improvement too:
    if(this.id == ((Student)o).getId())
    will throw a ClassCastException whenever you pass something that is not a Student, and a NullPointerException if <tt>o</tt> is <tt>null</tt>; and that's kind of tacky.

    The usual pattern for equals is:
    public boolean equals(Object o) {
       if (this == o)
          return true;
       if (!(o instanceof Student))
          return false;
       Student other = (Student) o;
       // now do the rest of the equality check with 'other'
    }
    Some may argue that 'instanceof' is not specific enough, but I've rarely found it to be a problem (and it does cover the case when the passed object is <tt>null</tt>).

    I suggest you read 833545's link (and maybe a few others) to improve your knowledge on these methods, because they are important.

    Winston
  • 796440
    796440 Member Posts: 19,179
    YoungWinston wrote:
    Some may argue that 'instanceof' is not specific enough
    It depends on the nature of the type hierarchy. If objects of different classes are meant to be equal, you have to use instanceof (or isAssignableFrom()) against the "root" type. For example, any List can be equal to any other List, so both ArrayList and LinkedList have to check for instanceof List. If, on the other hand, your requirements dictate that an object can only be equal to another object of the same class, then you need to use getClass(), after a separate check for null.
This discussion has been closed.