HashMap
About
HashMap
is a part of the java.util
package and is used to store key-value pairs. It is based on hashing and provides constant-time performance for basic operations like insertion, deletion, and lookup under ideal conditions.
Keys are unique, while values can be duplicated.
Allows
null
as a key (only onenull
key is allowed) and multiplenull
values.It is unsynchronized and not thread-safe by default.
Features
Key-Value Mapping: Stores data as key-value pairs where the key is unique.
Fast Access: O(1) average time complexity for most operations.
Dynamic Resizing: Automatically resizes when the number of entries exceeds the load factor.
Allows Null: Accepts one
null
key and multiplenull
values.Order: Does not guarantee any order of keys or values.
Non-Synchronized: Use
Collections.synchronizedMap()
for thread-safe operations orConcurrentHashMap
for better performance.Customizable Hashing: Custom objects can be used as keys by overriding
hashCode()
andequals()
.
Internal Working
Java's HashMap works on the principle of hashing and uses an array of linked lists (buckets) to store key-value pairs. When a key is inserted, its hash code is computed and mapped to an index in the internal array. If multiple keys hash to the same index (collision), entries are stored in a linked list (Java 7) or balanced tree (TreeNode) if collisions exceed a threshold (Java 8+). get() and put() operations ideally run in O(1) time but degrade to O(log n) in worst cases. Resizing occurs when the load factor (default 0.75) is exceeded, doubling the bucket array size and rehashing entries.
Data Structure
Internally, HashMap
maintains an array of buckets, where each bucket is either:
A Linked List (for low collisions).
A Red-Black Tree (when many keys collide, improving worst-case performance).
The array index for storing elements is determined using the hash function.
Each bucket is an instance of Node<K, V>, defined as:
Hashing
When a key-value pair is inserted, the key’s hashCode()
method generates an integer hash code.
The hash code is then compressed into a bucket index using this formula:
index=hash & (capacity−1)
hash
is the computed hash of the key (after applyinghashCode()
and additional processing).capacity
is the size of the internal bucket array (default:16
).Bitwise AND (
&
) with (capacity - 1
) ensures fast index computation while distributing entries evenly across the array.This approach works efficiently because HashMap always maintains capacity as a power of 2, ensuring that
capacity - 1
is a bitmask (all 1s in binary).Instead of using modulo (
%
) i.e. % capacity, which involves division (expensive),HashMap
uses the faster bitwise AND (&
) operation.This is based on the property that:
x % 2^n = x & (2^n−1)
.Thus, for power-of-2 capacities,hash & (capacity - 1)
is functionally equivalent tohash % capacity
, but much faster.Why
capacity - 1
?Java always keeps
capacity
as a power of 2 (e.g., 16, 32, 64, etc.). When you subtract1
from a power of 2, you get a bitmask with all bits set to1
.
Capacity (2^n
)
Binary (capacity
)
capacity - 1
Binary (capacity - 1
)
8 (2^3
)
0000 1000
(8)
7
0000 0111
(7)
16 (2^4
)
0001 0000
(16)
15
0000 1111
(15)
32 (2^5
)
0010 0000
(32)
31
0001 1111
(31)
Since
capacity - 1
has all lower bits set to 1, performing bitwise AND (&
) withhash
extracts only the lower bits.This is equivalent to
hash % capacity
but much faster because division (%
) is slow compared to bitwise operations.
Example: HashMap Index Calculation
Let's say we have a key with hashCode = 10
, and the HashMap
has a capacity of 8`.
Step 1: Compute Index Using Bitwise AND
Step 2: Binary Breakdown
Result: The bucket index is 2
Buckets
Each bucket in the array stores a linked list or a balanced binary tree (Java 8+).
If two keys hash to the same index, they are stored in the same bucket, forming a collision chain. Note that two different keys could have the same hash code, as there may be an infinite number of keys and a finite number of integers.
Collision Handling
Separate Chaining: Colliding keys are stored as nodes in a linked list within the same bucket.
Treeify (Java 8+): When a bucket's size exceeds a threshold (default: 8), the list is converted into a red-black tree for faster lookup.
Load Factor and Resizing
Load Factor: The ratio of the number of elements to the capacity of the bucket array.
Default load factor: 0.75.
When the load factor exceeds this value, resizing is triggered.
Resizing Process:
Create a new array with double the capacity.
Rehash all entries from the old array to the new one.
This ensures even distribution across buckets.
Operations
Insertion
Compute the hash code of the key using
hashCode()
.Calculate the index using the hash code and the bucket array size.
If no entry exists at the index, the new key-value pair is added.
If a collision occurs:
Traverse the bucket’s linked list or tree to check if the key already exists.
If the key exists, update its value.
Otherwise, append a new node to the list (or add it to the tree).
Retrieval
Compute the hash code of the key and calculate the bucket index.
Traverse the linked list or tree at the calculated index to find the node with the matching key.
Return the value associated with the key, or
null
if not found.
Resizing
When the
HashMap
exceeds its load factor (default: 0.75), the bucket array is resized to double its previous size.All existing entries are rehashed and redistributed into the new array.
This is an expensive operation, as it involves recomputing indices for all keys.
Key Methods
Method
Description
put(K key, V value)
Associates the specified value with the specified key.
get(Object key)
Returns the value associated with the key or null
if the key is not present.
remove(Object key)
Removes the key-value pair for the specified key.
containsKey(Object key)
Checks if the map contains the specified key.
containsValue(Object value)
Checks if the map contains the specified value.
keySet()
Returns a Set
view of the keys contained in the map.
values()
Returns a Collection
view of the values.
entrySet()
Returns a Set
view of the key-value pairs.
size()
Returns the number of key-value mappings in the map.
isEmpty()
Checks if the map is empty.
clear()
Removes all key-value pairs.
Big-O for Operations
Java 7 & Before → Worst case O(n) due to linked list collisions.
Java 8+ (Tree Buckets) → Worst case O(log n) (not O(n) anymore) when many collisions happen.
Operation
Average Case
Worst Case (Java 8+)
Get/Put (No Collision)
O(1)
O(log n) (with tree-based bucket)
Remove
O(1)
O(log n)
Contains Key/Value
O(1)
O(log n)
Iteration
O(n)
O(n)
Limitations
Not thread-safe (use
ConcurrentHashMap
for thread safety).Performance degrades with a poor hash function or excessive collisions.
Resizing can be computationally expensive.
Real-World Usage
Caching Systems: Store frequently accessed data for quick retrieval.
Configuration Maps: Store application settings as key-value pairs.
Data Deduplication: Quickly identify duplicates in large datasets.
Examples
1. Basic Operations
2. Iteration
3. Handling Null Keys and Values
4. Custom Key with Overridden hashCode()
and equals()
hashCode()
and equals()
Last updated
Was this helpful?