A HashMap is a part of Java's Collections Framework, and it stores data in key-value pairs. Internally, it uses a hash table to store entries, making it efficient for fast lookups, insertions, and deletions. The underlying structure of a HashMap is based on a dynamic array of buckets (or "bins") and the hashing technique for quick access.
Here’s an overview of how HashMap works internally:
1. Structure of HashMap
A HashMap consists of the following key components:
Buckets: A bucket is essentially an index in an array where hash collisions are stored. A
HashMapis backed by an array, and each entry (a key-value pair) is stored in one of the array's indices (buckets).Entries (Key-Value Pairs): Each entry is a key-value pair, stored in the buckets.
Hashing: A hash function is used to determine the index (bucket) where a key-value pair will be stored.
2. Hashing Process
When a key-value pair is added to a HashMap, the key undergoes a hashing process to determine where the key-value pair should be placed:
Hash Code Calculation:
ThehashCode()method is called on the key. ThehashCode()is an integer that uniquely identifies the key's identity (although different keys can have the same hash code, leading to a hash collision).Bucket Index Calculation:
The hash code is then processed to determine the index in the array where the key-value pair should be stored. Typically, a formula like this is used to calculate the bucket index:Here,
array.length - 1ensures that the index is within the bounds of the array.Handling Hash Collisions:
If multiple keys have the same hash code (a collision), theHashMapstores them in a linked list or a red-black tree (from Java 8 onward) at the same bucket. This helps ensure that even with collisions, theHashMapremains efficient.
3. Add Method (put())
When a new key-value pair is added to the HashMap using the put() method, the following steps occur:
Calculate Hash Code and Index:
The hash code of the key is calculated, and the bucket index is determined.Check for Existing Key:
If there is already an entry in the bucket at that index:- If the key already exists, the existing value is replaced with the new value.
- If the key doesn’t exist, a new entry (node) is added to the bucket, either at the beginning or at the end of the chain (for linked lists) or as a child node in the red-black tree (for high collision cases).
Rehashing:
If the number of elements in theHashMapexceeds a certain threshold (determined by the load factor, typically0.75), theHashMapautomatically resizes and rehashes all the entries into a larger array.
4. Get Method (get())
When retrieving a value from a HashMap using the get() method:
Calculate Hash Code and Index:
The hash code of the key is calculated and used to find the corresponding bucket.Search the Bucket:
If there are multiple entries in the bucket (due to hash collisions), theHashMaptraverses the chain of entries (either in the linked list or the red-black tree) and compares the keys using theequals()method to find the matching key.Return the Value:
Once the matching key is found, the associated value is returned.
5. Remove Method (remove())
To remove an entry:
Calculate Hash Code and Index:
The hash code of the key is calculated, and the index of the corresponding bucket is determined.Search the Bucket:
The bucket is searched, and if there are multiple entries (due to collisions), theequals()method is used to identify the matching key.Remove the Entry:
Once the entry is found, it is removed from the linked list or red-black tree. If the bucket becomes empty, it is marked as empty.
6. Resizing and Rehashing
When the HashMap reaches a certain load factor (usually 0.75), it automatically resizes:
New Array Creation:
A new, larger array is created (typically double the size of the original array).Rehashing Entries:
All existing entries are rehashed and redistributed into the new array. This ensures that theHashMapremains efficient even as the number of entries grows.
7. Bucket Size and Load Factor
Initial Capacity: The number of buckets in the internal array when the
HashMapis first created. The default value is 16.Load Factor: Determines when the
HashMapshould resize. The default load factor is0.75, which means that theHashMapwill resize when it is 75% full.
Example:
8. Example of HashMap in Action
Key Points to Remember
- Hashing: The
HashMapuses thehashCode()andequals()methods to determine where to store and retrieve entries. - Buckets: Entries are stored in an array of buckets, and collisions are handled using a linked list or red-black tree.
- Resizing: The
HashMapautomatically resizes when it exceeds the load factor threshold. - Performance:
- Average Time Complexity for
put(),get(), andremove()is O(1), but it can degrade to O(n) in the worst case (for hash collisions).
- Average Time Complexity for
Comments
Post a Comment