Skip to main content

Internal Working of HashMap in Java

 

A HashMap is a part of Java's Collections Framework, and it stores data in key-value pairs. Internally, it uses a hash table to store entries, making it efficient for fast lookups, insertions, and deletions. The underlying structure of a HashMap is based on a dynamic array of buckets (or "bins") and the hashing technique for quick access.

Here’s an overview of how HashMap works internally:


1. Structure of HashMap

A HashMap consists of the following key components:

  1. Buckets: A bucket is essentially an index in an array where hash collisions are stored. A HashMap is backed by an array, and each entry (a key-value pair) is stored in one of the array's indices (buckets).

  2. Entries (Key-Value Pairs): Each entry is a key-value pair, stored in the buckets.

  3. Hashing: A hash function is used to determine the index (bucket) where a key-value pair will be stored.

2. Hashing Process

When a key-value pair is added to a HashMap, the key undergoes a hashing process to determine where the key-value pair should be placed:

  1. Hash Code Calculation:
    The hashCode() method is called on the key. The hashCode() is an integer that uniquely identifies the key's identity (although different keys can have the same hash code, leading to a hash collision).

  2. Bucket Index Calculation:
    The hash code is then processed to determine the index in the array where the key-value pair should be stored. Typically, a formula like this is used to calculate the bucket index:


    int index = hashCode & (array.length - 1);

    Here, array.length - 1 ensures that the index is within the bounds of the array.

  3. Handling Hash Collisions:
    If multiple keys have the same hash code (a collision), the HashMap stores them in a linked list or a red-black tree (from Java 8 onward) at the same bucket. This helps ensure that even with collisions, the HashMap remains efficient.


3. Add Method (put())

When a new key-value pair is added to the HashMap using the put() method, the following steps occur:

  1. Calculate Hash Code and Index:
    The hash code of the key is calculated, and the bucket index is determined.

  2. Check for Existing Key:
    If there is already an entry in the bucket at that index:

    • If the key already exists, the existing value is replaced with the new value.
    • If the key doesn’t exist, a new entry (node) is added to the bucket, either at the beginning or at the end of the chain (for linked lists) or as a child node in the red-black tree (for high collision cases).
  3. Rehashing:
    If the number of elements in the HashMap exceeds a certain threshold (determined by the load factor, typically 0.75), the HashMap automatically resizes and rehashes all the entries into a larger array.


4. Get Method (get())

When retrieving a value from a HashMap using the get() method:

  1. Calculate Hash Code and Index:
    The hash code of the key is calculated and used to find the corresponding bucket.

  2. Search the Bucket:
    If there are multiple entries in the bucket (due to hash collisions), the HashMap traverses the chain of entries (either in the linked list or the red-black tree) and compares the keys using the equals() method to find the matching key.

  3. Return the Value:
    Once the matching key is found, the associated value is returned.


5. Remove Method (remove())

To remove an entry:

  1. Calculate Hash Code and Index:
    The hash code of the key is calculated, and the index of the corresponding bucket is determined.

  2. Search the Bucket:
    The bucket is searched, and if there are multiple entries (due to collisions), the equals() method is used to identify the matching key.

  3. Remove the Entry:
    Once the entry is found, it is removed from the linked list or red-black tree. If the bucket becomes empty, it is marked as empty.


6. Resizing and Rehashing

When the HashMap reaches a certain load factor (usually 0.75), it automatically resizes:

  1. New Array Creation:
    A new, larger array is created (typically double the size of the original array).

  2. Rehashing Entries:
    All existing entries are rehashed and redistributed into the new array. This ensures that the HashMap remains efficient even as the number of entries grows.


7. Bucket Size and Load Factor

  • Initial Capacity: The number of buckets in the internal array when the HashMap is first created. The default value is 16.

  • Load Factor: Determines when the HashMap should resize. The default load factor is 0.75, which means that the HashMap will resize when it is 75% full.

Example:


// Creating a HashMap with initial capacity and load factor HashMap<String, Integer> map = new HashMap<>(16, 0.75f);

8. Example of HashMap in Action


import java.util.HashMap; public class HashMapExample { public static void main(String[] args) { HashMap<String, String> map = new HashMap<>(); // Adding key-value pairs to the HashMap map.put("Apple", "Fruit"); map.put("Carrot", "Vegetable"); map.put("Banana", "Fruit"); // Retrieving values System.out.println("Apple is a " + map.get("Apple")); // Output: Fruit System.out.println("Carrot is a " + map.get("Carrot")); // Output: Vegetable // Removing a key-value pair map.remove("Banana"); // Checking if a key exists System.out.println("Contains 'Banana'? " + map.containsKey("Banana")); // Output: false } }

Key Points to Remember

  • Hashing: The HashMap uses the hashCode() and equals() methods to determine where to store and retrieve entries.
  • Buckets: Entries are stored in an array of buckets, and collisions are handled using a linked list or red-black tree.
  • Resizing: The HashMap automatically resizes when it exceeds the load factor threshold.
  • Performance:
    • Average Time Complexity for put(), get(), and remove() is O(1), but it can degrade to O(n) in the worst case (for hash collisions).

Comments

Popular posts from this blog

Interview questions - Strings

1. Why Strings are immutable?  In Java, String is immutable , meaning once a String object is created, its value cannot be changed . Any operation that seems to modify a String actually creates a new object instead of modifying the existing one. Lets understand with below program.  find code here The String Constant Pool (also known as the String Intern Pool ) is a special memory area inside the heap where Java stores string literals to optimize memory usage and improve performance. When we create String password = "password@123"; using literal it will create instance in String constant pool and when we assign password = "changepassword", it will not change previous instance, but create new one inside String constant pool. Check hash code before and after changing value both are different. Lets understand scenario2 : code public class StringMemoryCheck {     public static void main(String[] args) {         String s1 = "Java";  // String...

Coding - Mostly Asked Coding Interview Question

 1. Print Mostly occurred string in Hello String import java.util.Map ; import java.util.Optional ; import java.util.stream.Collectors ; public class MostOccuringCharacter { // print most occuring character in hello string and write program to use optional public static void main ( String [] args ) { String str = "Hello" ; Optional < Map . Entry < Character , Long >> character = str .chars() .mapToObj( x -> ( char ) x ) .collect( Collectors . groupingBy ( x -> x , Collectors . counting ())) .entrySet() .stream() .max( Map . Entry . comparingByValue ()); if ( character .isPresent()){ System . out .println( "Most occurring character: " + character .get().getKey()); } //or character .ifPresent( c -> System . out .println( "Most occurring character: " + c .getKey())); } }