How To Add Unique Objects In Java HashSet

Introduction

In this example I am going to show you how to add unique objects to a Java HashSet. The Set interface in Java does not allow duplicate elements. The HashSet class enforces this uniqueness by using a HashMap internally: each element of the set is stored as a key in the map.

When you add String values or wrapper types such as Integer to a HashSet (primitive values are autoboxed into their wrapper objects), the elements are unique out of the box, because these classes already implement equals() and hashCode(). But what if you add custom objects to a HashSet: how is uniqueness maintained? Which fields of your Java class should the HashSet use to determine uniqueness? In this case you need to override equals() and hashCode() in your class.

What is HashSet

  • HashSet extends AbstractSet and implements the Set interface.
  • HashSet also implements the Serializable and Cloneable interfaces.
  • HashSet is backed by a hash table (actually a HashMap instance), i.e., it uses the HashMap to store its elements.
  • It permits the null element; because elements are unique, null can be stored only once (just as HashMap allows only one null key).
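
The last point can be checked with a small sketch. The class and method names here (NullElementDemo, sizeAfterAddingNullTwice) are made up for illustration only:

```java
import java.util.HashSet;
import java.util.Set;

class NullElementDemo {

    // Adds null twice and returns the resulting size of the set.
    static int sizeAfterAddingNullTwice() {
        Set<String> set = new HashSet<>();
        set.add(null); // accepted: HashSet permits the null element
        set.add(null); // ignored: null is already present
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(sizeAfterAddingNullTwice()); // prints 1
    }
}
```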

Elements order in HashSet

HashSet makes no guarantees as to the iteration order of the set; in particular, it does not guarantee that the order will remain constant over time.
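
If you need a predictable order, one simple option is to copy the elements into a list and sort it. The sketch below assumes that natural String ordering is what you want; SortedViewDemo and sortedCopy are hypothetical names used only for this illustration:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

class SortedViewDemo {

    // Copies the set into a list and sorts it, so the result no longer
    // depends on how the elements are spread over the hash buckets.
    static List<String> sortedCopy(Set<String> set) {
        List<String> copy = new ArrayList<>(set);
        Collections.sort(copy); // natural String order: upper case sorts before lower case
        return copy;
    }

    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        Collections.addAll(set, "c", "B", "a", "C", "b", "A");
        System.out.println(sortedCopy(set)); // prints [A, B, C, a, b, c]
    }
}
```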

Performance of HashSet

This class offers constant time performance for the basic operations (add, remove, contains and size), assuming the hash function disperses the elements properly among the buckets.

Like HashMap, a HashSet instance has two parameters that affect its performance: capacity and load factor.

The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created.

The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed, i.e., internal data structures are rebuilt, so that the hash table has approximately twice the number of buckets.

Iterating over this set requires time proportional to the sum of the HashSet instance’s size (the number of elements) plus the capacity (the number of buckets) of the backing HashMap. Therefore, it is very important not to set the initial capacity too high (or the load factor too low) if iteration performance is important.

As a general rule of thumb, the default load factor (.75) offers a good tradeoff between time and space costs.
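
To make the relationship between capacity and load factor concrete, here is a minimal sketch. The helper resizeThreshold is my own illustration of the "load factor times capacity" product described above, not a method of HashSet; the two-argument constructor HashSet(int initialCapacity, float loadFactor) is part of the standard API:

```java
import java.util.HashSet;
import java.util.Set;

class CapacityDemo {

    // Number of entries the table may hold before it is resized:
    // the product of the current capacity and the load factor.
    static int resizeThreshold(int capacity, float loadFactor) {
        return (int) (capacity * loadFactor);
    }

    public static void main(String[] args) {
        // Default settings of new HashSet<>(): 16 buckets, load factor 0.75
        System.out.println(resizeThreshold(16, 0.75f)); // prints 12

        // Explicit initial capacity and load factor
        Set<Integer> ids = new HashSet<>(64, 0.75f);
        for (int i = 0; i < 40; i++) {
            ids.add(i); // 40 entries stay below the threshold of 48
        }
        System.out.println(ids.size()); // prints 40
    }
}
```

With the defaults, adding a 13th element is what triggers the first resize.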

Accessing in Multi-threaded Environment

Note that the HashSet implementation is not synchronized. If multiple threads access a set concurrently, and at least one of the threads modifies the set, it must be synchronized externally. This is typically accomplished by synchronizing on some object that naturally encapsulates the set. If no such object exists, the set should be “wrapped” using the Collections.synchronizedSet method. This is best done at creation time, to prevent accidental unsynchronized access to the set:

Set s = Collections.synchronizedSet(new HashSet(...));
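
A slightly fuller sketch of the same idea follows. The class name and the two writer threads are assumptions made for the demonstration. Note that individual calls such as add() are synchronized by the wrapper, but iteration is not, so the loop has to lock the set manually:

```java
import java.util.Collections;
import java.util.HashSet;
import java.util.Set;

class SynchronizedSetDemo {

    // Two threads add overlapping ranges; returns the number of distinct elements.
    static int fillConcurrently() {
        Set<Integer> set = Collections.synchronizedSet(new HashSet<>());

        Thread t1 = new Thread(() -> { for (int i = 0; i < 1000; i++) set.add(i); });
        Thread t2 = new Thread(() -> { for (int i = 500; i < 1500; i++) set.add(i); });
        t1.start();
        t2.start();
        try {
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }

        int count = 0;
        // Iteration over a synchronized wrapper must hold the wrapper's lock.
        synchronized (set) {
            for (Integer ignored : set) {
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) {
        System.out.println(fillConcurrently()); // prints 1500
    }
}
```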

HashSet is Fail-fast

If the set is structurally modified at any time after the iterator is created, in any way except through the iterator’s own remove method, the iterator will throw a ConcurrentModificationException.  Thus, in the face of concurrent modification, the iterator fails quickly and cleanly, rather than risking arbitrary, non-deterministic behavior at an undetermined time in the future.

The fail-fast behavior of an iterator cannot be guaranteed and iterators throw ConcurrentModificationException on a best-effort basis. Therefore, it would be wrong to write a program that depended on this exception for its correctness: the fail-fast behavior of iterators should be used only to detect bugs.
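
The sketch below shows both sides of this rule in a single thread: removing through the set while a for-each loop is running, and removing through the iterator itself. The class and method names are hypothetical. On a typical JDK the first method reports the exception, but as noted above you should not build program logic on that:

```java
import java.util.Arrays;
import java.util.ConcurrentModificationException;
import java.util.HashSet;
import java.util.Iterator;
import java.util.Set;

class FailFastDemo {

    // Removes an element through the set while a for-each loop is iterating.
    // Returns true if the iterator detected the modification.
    static boolean removeDuringForEach() {
        Set<String> set = new HashSet<>(Arrays.asList("a", "b", "c"));
        try {
            for (String s : set) {
                set.remove(s); // structural modification behind the iterator's back
            }
            return false;
        } catch (ConcurrentModificationException e) {
            return true;
        }
    }

    // Removes "b" through the iterator's own remove method, which is allowed.
    static Set<String> removeViaIterator() {
        Set<String> set = new HashSet<>(Arrays.asList("a", "b", "c"));
        Iterator<String> it = set.iterator();
        while (it.hasNext()) {
            if (it.next().equals("b")) {
                it.remove();
            }
        }
        return set;
    }

    public static void main(String[] args) {
        System.out.println(removeDuringForEach());
        System.out.println(removeViaIterator().size()); // prints 2
    }
}
```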

Internal Working of HashSet

If you look into the source code of the HashSet class (HashSet.java), you will find something similar to the code below:

public class HashSet<E>
    extends AbstractSet<E>
    implements Set<E>, Cloneable, java.io.Serializable
{
    private transient HashMap<E,Object> map;
    // Dummy value to associate with an Object in the backing Map
    private static final Object PRESENT = new Object();
    public HashSet() {
        map = new HashMap<>();
    }
    public boolean add(E e) {
        return map.put(e, PRESENT)==null;
    }
    /**
    * Some other code
    */
}

It is clear from the above source code that the set achieves uniqueness through HashMap, because each key in a HashMap is unique. When an instance of HashSet is created, it creates an instance of HashMap internally. When an element is added to the HashSet with the add(E e) method, it is actually put into the HashMap as a key. A value needs to be associated with every key, so a dummy value PRESENT (private static final Object PRESENT = new Object();) is associated with each key in the HashMap.

Now look at the add(E e) method:

public boolean add(E e) {
    return map.put(e, PRESENT)==null;
}

So there are two possible cases:

  1. map.put(e, PRESENT) returns null if the element is not yet present in the map. The expression map.put(e, PRESENT) == null then evaluates to true, so the add method returns true and the element is added to the HashSet.
  2. map.put(e, PRESENT) returns the old value (PRESENT) if the element is already present in the map. The expression map.put(e, PRESENT) == null then evaluates to false, so the add method returns false and the contents of the HashSet do not change.
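
Both cases can be reproduced on a plain HashMap. This is only a sketch that mirrors the add method shown above; the PRESENT constant and the addTo helper here are stand-ins I defined for the demonstration, not the private members of HashSet itself:

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

class AddReturnValueDemo {

    // Stand-in for the dummy value that HashSet keeps in its private field.
    private static final Object PRESENT = new Object();

    // Mirrors the body of HashSet.add on a plain HashMap.
    static boolean addTo(Map<String, Object> map, String e) {
        return map.put(e, PRESENT) == null;
    }

    public static void main(String[] args) {
        Map<String, Object> map = new HashMap<>();
        System.out.println(addTo(map, "a")); // prints true  (case 1: new key)
        System.out.println(addTo(map, "a")); // prints false (case 2: key already present)
        System.out.println(map.size());      // prints 1

        // The real HashSet reports the same results.
        Set<String> set = new HashSet<>();
        System.out.println(set.add("a")); // prints true
        System.out.println(set.add("a")); // prints false
    }
}
```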

HashSet Example

package com.roytuts.collections;
import java.util.HashSet;
import java.util.Set;
public class HashSetExample {
    public static void main(String[] args) {
        Set<String> set = new HashSet<>();
        set.add("a");
        set.add("b");
        set.add("c");
        set.add("A");
        set.add("B");
        set.add("C");
        set.add("a");
        System.out.println(set);
    }
}

Output

[a, A, b, B, c, C]

So what happened when I passed a duplicate element ("a") to the HashSet? The add(e) method returns false when the element already exists in the HashSet, otherwise it returns true. Therefore the duplicate element was not added. Note that the upper-case "A" is a different string from "a", so both are kept.

Adding Custom Objects to HashSet

It is really important to override equals() and hashCode() for any class whose objects you are going to store in a HashSet. Because each object is used as a key in the backing map, you must override both methods to maintain the uniqueness of the elements.

Add Unique Objects

Create an Employee class. Notice that I have overridden the equals() and hashCode() methods so that two Employee objects are considered equal when their name and address attributes match; the id field is not part of the comparison.

import java.util.Objects;

public class Employee {

	private int id;
	private String name;
	private String address;

	public Employee() {
	}

	public Employee(int id, String name, String address) {
		this.id = id;
		this.name = name;
		this.address = address;
	}

	public int getId() {
		return id;
	}

	public void setId(int id) {
		this.id = id;
	}

	public String getName() {
		return name;
	}

	public void setName(String name) {
		this.name = name;
	}

	public String getAddress() {
		return address;
	}

	public void setAddress(String address) {
		this.address = address;
	}

	@Override
	public int hashCode() {
		return Objects.hash(address, name);
	}

	@Override
	public boolean equals(Object obj) {
		if (this == obj)
			return true;
		if (obj == null)
			return false;
		if (getClass() != obj.getClass())
			return false;
		Employee other = (Employee) obj;
		return Objects.equals(address, other.address) && Objects.equals(name, other.name);
	}

	@Override
	public String toString() {
		return "Employee [id=" + id + ", name=" + name + ", address=" + address + "]";
	}

}

Create a test class to verify the uniqueness of the objects added to the HashSet.

import java.util.HashSet;
import java.util.Set;

public class HashsetApp {

	public static void main(String[] args) {
		Employee e1 = new Employee(1000, "Liton", "Falakata");
		Employee e2 = new Employee(1001, "Liton", "Falakata");
		Employee e3 = new Employee(1000, "Liton", "Falakata");
		Employee e4 = new Employee(1003, "Debabrata", "Birati");
		Employee e5 = new Employee(1000, "Souvik", "Kalighat");

		Set<Employee> set = new HashSet<>();
		set.add(e1);
		set.add(e2);
		set.add(e3);
		set.add(e4);
		set.add(e5);

		// for (Employee employee : set) {
		// System.out.println(employee);
		// }

		set.forEach(System.out::println);
	}

}

Output

Running the above main class will produce the following output:

Employee [id=1000, name=Souvik, address=Kalighat]
Employee [id=1000, name=Liton, address=Falakata]
Employee [id=1003, name=Debabrata, address=Birati]

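If you do not override the hashCode() and equals() methods, all five objects are kept, and the output will look like this (the order may differ between runs, because the default hashCode() is identity-based):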

Employee [id=1003, name=Debabrata, address=Birati]
Employee [id=1000, name=Souvik, address=Kalighat]
Employee [id=1000, name=Liton, address=Falakata]
Employee [id=1001, name=Liton, address=Falakata]
Employee [id=1000, name=Liton, address=Falakata]
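
Since the printed order is not reliable, the difference is easier to verify by comparing set sizes. The following is a reduced sketch with two hypothetical classes, PlainEmployee (no overrides) and NamedEmployee (equality based on name only); they are simplified stand-ins and not the Employee class from this tutorial:

```java
import java.util.HashSet;
import java.util.Objects;
import java.util.Set;

class EqualityContractDemo {

    // No overrides: inherits identity-based equals() and hashCode() from Object.
    static class PlainEmployee {
        final String name;
        PlainEmployee(String name) { this.name = name; }
    }

    // Overrides both methods, so equality is decided by the name field.
    static class NamedEmployee {
        final String name;
        NamedEmployee(String name) { this.name = name; }

        @Override
        public int hashCode() {
            return Objects.hash(name);
        }

        @Override
        public boolean equals(Object obj) {
            if (this == obj) return true;
            if (obj == null || getClass() != obj.getClass()) return false;
            return Objects.equals(name, ((NamedEmployee) obj).name);
        }
    }

    // Two distinct instances with the same name are both kept.
    static int plainSize() {
        Set<PlainEmployee> set = new HashSet<>();
        set.add(new PlainEmployee("Liton"));
        set.add(new PlainEmployee("Liton"));
        return set.size();
    }

    // The second instance is recognised as a duplicate and rejected.
    static int namedSize() {
        Set<NamedEmployee> set = new HashSet<>();
        set.add(new NamedEmployee("Liton"));
        set.add(new NamedEmployee("Liton"));
        return set.size();
    }

    public static void main(String[] args) {
        System.out.println(plainSize()); // prints 2
        System.out.println(namedSize()); // prints 1
    }
}
```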

You now have an idea of how to add unique objects to a Java HashSet.
