HashMap

About

HashMap is a part of the java.util package and is used to store key-value pairs. It is based on hashing and provides constant-time performance for basic operations like insertion, deletion, and lookup under ideal conditions.

Keys are unique, while values can be duplicated.
Allows null as a key (only one null key is allowed) and multiple null values.
It is unsynchronized and not thread-safe by default.

Features

Key-Value Mapping: Stores data as key-value pairs where the key is unique.
Fast Access: O(1) average time complexity for most operations.
Dynamic Resizing: Automatically resizes when the number of entries exceeds the load factor.
Allows Null: Accepts one null key and multiple null values.
Order: Does not guarantee any order of keys or values.
Non-Synchronized: Use Collections.synchronizedMap() for thread-safe operations or ConcurrentHashMap for better performance.
Customizable Hashing: Custom objects can be used as keys by overriding hashCode() and equals().

Internal Working

Java's HashMap works on the principle of hashing and uses an array of linked lists (buckets) to store key-value pairs. When a key is inserted, its hash code is computed and mapped to an index in the internal array. If multiple keys hash to the same index (collision), entries are stored in a linked list (Java 7) or balanced tree (TreeNode) if collisions exceed a threshold (Java 8+). get() and put() operations ideally run in O(1) time but degrade to O(log n) in worst cases. Resizing occurs when the load factor (default 0.75) is exceeded, doubling the bucket array size and rehashing entries.

Data Structure

Internally, HashMap maintains an array of buckets, where each bucket is either:

A Linked List (for low collisions).
A Red-Black Tree (when many keys collide, improving worst-case performance).

The array index for storing elements is determined using the hash function.

// Internal structure of HashMap
transient Node<K,V>[] table;

Each bucket is an instance of Node<K, V>, defined as:

static class Node<K,V> implements Map.Entry<K,V> {
    final int hash;   // Precomputed hash value of the key
    final K key;      // Key
    V value;          // Associated value
    Node<K,V> next;   // Next node in the linked list (for collisions)
}

Hashing

When a key-value pair is inserted, the key’s hashCode() method generates an integer hash code. The hash code is then compressed into a bucket index using this formula:

index=hash & (capacity−1)

hash is the computed hash of the key (after applying hashCode() and additional processing).
capacity is the size of the internal bucket array (default: 16).
Bitwise AND (&) with (capacity - 1) ensures fast index computation while distributing entries evenly across the array.
This approach works efficiently because HashMap always maintains capacity as a power of 2, ensuring that capacity - 1 is a bitmask (all 1s in binary).
Instead of using modulo (%) i.e. % capacity, which involves division (expensive), HashMap uses the faster bitwise AND (&) operation.
This is based on the property that: x % 2^n = x & (2^n−1).Thus, for power-of-2 capacities, hash & (capacity - 1) is functionally equivalent to hash % capacity, but much faster.
Why capacity - 1?
Java always keeps capacity as a power of 2 (e.g., 16, 32, 64, etc.). When you subtract 1 from a power of 2, you get a bitmask with all bits set to 1.

Capacity (2^n)

Binary (capacity)

capacity - 1

Binary (capacity - 1)

8 (2^3)

0000 1000 (8)

0000 0111 (7)

16 (2^4)

0001 0000 (16)

0000 1111 (15)

32 (2^5)

0010 0000 (32)

0001 1111 (31)

Since capacity - 1 has all lower bits set to 1, performing bitwise AND (&) with hash extracts only the lower bits.
This is equivalent to hash % capacity but much faster because division (%) is slow compared to bitwise operations.

Example: HashMap Index Calculation

Let's say we have a key with hashCode = 10, and the HashMap has a capacity of 8`.

Step 1: Compute Index Using Bitwise AND

int hash = 10;             // Example hash code (1010 in binary)
int capacity = 8;          // Power of 2
int index = hash & (capacity - 1);  // Equivalent to hash % capacity

System.out.println("Bucket Index: " + index);

Step 2: Binary Breakdown

hashCode  =  10  =  1010 (binary)
capacity  =   8  =  1000 (binary)
capacity-1 =  7  =  0111 (binary)
----------------------------------
Bitwise AND: 1010  (10)
           & 0111  ( 7)
           ----------------
             0010  ( 2)

Result: The bucket index is 2

Buckets

Each bucket in the array stores a linked list or a balanced binary tree (Java 8+).
If two keys hash to the same index, they are stored in the same bucket, forming a collision chain. Note that two different keys could have the same hash code, as there may be an infinite number of keys and a finite number of integers.

Collision Handling

Separate Chaining: Colliding keys are stored as nodes in a linked list within the same bucket.
Treeify (Java 8+): When a bucket's size exceeds a threshold (default: 8), the list is converted into a red-black tree for faster lookup.

Treeification in Java 8+

Why Treeify? To improve lookup time in case of many collisions. A linked list traversal is O(n), while a red-black tree lookup is O(log n).
How Treeify Works:
- When the number of nodes in a bucket exceeds a threshold (TREEIFY_THRESHOLD = 8), the linked list is converted into a red-black tree.
- If the size of the bucket reduces (due to deletions), the tree is converted back to a linked list.

Load Factor and Resizing

Load Factor: The ratio of the number of elements to the capacity of the bucket array.
```
loadFactor = size / capacity
```
- Default load factor: 0.75.
- When the load factor exceeds this value, resizing is triggered.
Resizing Process:
1. Create a new array with double the capacity.
2. Rehash all entries from the old array to the new one.
3. This ensures even distribution across buckets.

Operations

Insertion

Compute the hash code of the key using hashCode().
Calculate the index using the hash code and the bucket array size.
If no entry exists at the index, the new key-value pair is added.
If a collision occurs:
- Traverse the bucket’s linked list or tree to check if the key already exists.
- If the key exists, update its value.
- Otherwise, append a new node to the list (or add it to the tree).

Retrieval

Compute the hash code of the key and calculate the bucket index.
Traverse the linked list or tree at the calculated index to find the node with the matching key.
Return the value associated with the key, or null if not found.

Resizing

When the HashMap exceeds its load factor (default: 0.75), the bucket array is resized to double its previous size.
All existing entries are rehashed and redistributed into the new array.
- This is an expensive operation, as it involves recomputing indices for all keys.

Key Methods

Method

Description

put(K key, V value)

Associates the specified value with the specified key.

get(Object key)

Returns the value associated with the key or null if the key is not present.

remove(Object key)

Removes the key-value pair for the specified key.

containsKey(Object key)

Checks if the map contains the specified key.

containsValue(Object value)

Checks if the map contains the specified value.

keySet()

Returns a Set view of the keys contained in the map.

values()

Returns a Collection view of the values.

entrySet()

Returns a Set view of the key-value pairs.

size()

Returns the number of key-value mappings in the map.

isEmpty()

Checks if the map is empty.

clear()

Removes all key-value pairs.

Big-O for Operations

Java 7 & Before → Worst case O(n) due to linked list collisions.
Java 8+ (Tree Buckets) → Worst case O(log n) (not O(n) anymore) when many collisions happen.

Operation

Average Case

Worst Case (Java 8+)

Get/Put (No Collision)

O(1)

O(log n) (with tree-based bucket)

Remove

O(1)

O(log n)

Contains Key/Value

O(1)

O(log n)

Iteration

O(n)

Limitations

Not thread-safe (use ConcurrentHashMap for thread safety).
Performance degrades with a poor hash function or excessive collisions.
Resizing can be computationally expensive.

Real-World Usage

Caching Systems: Store frequently accessed data for quick retrieval.
Configuration Maps: Store application settings as key-value pairs.
Data Deduplication: Quickly identify duplicates in large datasets.

Examples

1. Basic Operations

import java.util.HashMap;

public class HashMapExample {
    public static void main(String[] args) {
        HashMap<Integer, String> map = new HashMap<>();

        // Adding elements
        map.put(1, "Alice");
        map.put(2, "Bob");
        map.put(3, "Charlie");
        System.out.println(map); // Output: {1=Alice, 2=Bob, 3=Charlie}

        // Accessing elements
        System.out.println("Value for key 2: " + map.get(2)); // Output: Value for key 2: Bob

        // Check presence of key or value
        System.out.println("Contains key 3: " + map.containsKey(3)); // Output: Contains key 3: true
        System.out.println("Contains value 'Bob': " + map.containsValue("Bob")); // Output: Contains value 'Bob': true

        // Removing an element
        map.remove(1);
        System.out.println(map); // Output: {2=Bob, 3=Charlie}
    }
}

2. Iteration

import java.util.HashMap;
import java.util.Map;

public class IterationExample {
    public static void main(String[] args) {
        HashMap<String, Integer> map = new HashMap<>();
        map.put("Apple", 3);
        map.put("Banana", 5);
        map.put("Cherry", 2);

        // Iterating using entrySet
        for (Map.Entry<String, Integer> entry : map.entrySet()) {
            System.out.println(entry.getKey() + " -> " + entry.getValue());
            // Output:
            // Apple -> 3
            // Banana -> 5
            // Cherry -> 2
        }

        // Iterating using keySet
        for (String key : map.keySet()) {
            System.out.println("Key: " + key + ", Value: " + map.get(key));
            // Output:
            // Key: Apple, Value: 3
            // Key: Banana, Value: 5
            // Key: Cherry, Value: 2
        }
    }
}

3. Handling Null Keys and Values

import java.util.HashMap;

public class NullHandlingExample {
    public static void main(String[] args) {
        HashMap<String, String> map = new HashMap<>();
        map.put(null, "NullKey"); // Adding a null key
        map.put("Test", null);   // Adding a null value
        System.out.println(map); // Output: {null=NullKey, Test=null}
    }
}

4. Custom Key with Overridden `hashCode()` and `equals()`

import java.util.HashMap;
import java.util.Objects;

class Employee {
    int id;
    String name;

    Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public int hashCode() {
        return Objects.hash(id, name);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Employee employee = (Employee) o;
        return id == employee.id && Objects.equals(name, employee.name);
    }

    @Override
    public String toString() {
        return "Employee{id=" + id + ", name='" + name + "'}";
    }
}

public class CustomKeyExample {
    public static void main(String[] args) {
        HashMap<Employee, String> map = new HashMap<>();
        map.put(new Employee(1, "Alice"), "HR");
        map.put(new Employee(2, "Bob"), "Finance");
        System.out.println(map);
        // Output: {Employee{id=1, name='Alice'}=HR, Employee{id=2, name='Bob'}=Finance}
    }
}

PreviousMap NextHashtable

Last updated 5 months ago

About

Features

Internal Working

Data Structure

Hashing

Example: HashMap Index Calculation

Buckets

Collision Handling

Load Factor and Resizing

Operations

Insertion

Retrieval

Resizing

Key Methods

Big-O for Operations

Limitations

Real-World Usage

Examples

1. Basic Operations

2. Iteration

3. Handling Null Keys and Values

4. Custom Key with Overridden hashCode() and equals()

4. Custom Key with Overridden `hashCode()` and `equals()`