Cache Design

1. Estimate Cache Memory needed for caching 1 million entries of a Java object

About

In a typical Spring Boot microservice, suppose we are caching frequently accessed data using an in-memory cache like ConcurrentHashMap, Caffeine, or Spring’s built-in caching abstraction. Each cached entry consists of:

  • Key: A unique identifier (e.g., String or UUID)

  • Value: A Java object with approximately 10–15 fields

Let’s estimate how much JVM heap memory will be required to cache 1 million such objects.

Breakdown of Memory Usage

1. Object Overhead (JVM Internal Structure)

Every object in the JVM has some built-in memory overhead, regardless of the fields it contains. This includes:

  • Object Header — 12 bytes (on a 64-bit JVM with compressed oops): stores object metadata such as the identity hash, GC information, and type pointer.

  • Object Padding — up to 8 bytes: the JVM aligns object sizes to 8-byte multiples for efficient memory access.

  • Reference Fields — 4 or 8 bytes each, depending on whether compressed oops are enabled.

  • Internal Pointers — add indirection and slight overhead when objects contain references to other objects.

Typical overhead per object (excluding fields): ~12–16 bytes. If references are involved, field alignment and padding increase usage further.

2. Estimated Field Composition (Example: 10 Fields)

Let's assume the object being cached looks like this:

  • UUID × 1 — 16 bytes: backed by two long values.

  • String × 4 — 40–80 bytes each: depends on string length; assume an average of 60 bytes.

  • int × 2 — 4 bytes each: 8 bytes total.

  • boolean × 2 — 1 byte each, but padded: 4–8 bytes total.

  • long × 1 — 8 bytes: a 64-bit value.
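Concretely, a cached value with this field mix might look like the following (a hypothetical example class; all field names are illustrative, not from the original text):

```java
import java.util.UUID;

// Hypothetical cached value: 1 UUID, 4 Strings, 2 ints, 2 booleans, 1 long
public class CustomerProfile {
    private UUID id;            // two longs of payload, plus the UUID object's own header
    private String firstName;   // each String: header + hash field + reference to a char/byte array
    private String lastName;
    private String email;
    private String city;
    private int loginCount;     // 4 bytes
    private int failedAttempts; // 4 bytes
    private boolean active;     // 1 byte, padded
    private boolean verified;   // 1 byte, padded
    private long lastSeenEpoch; // 8 bytes
}
```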

Total estimated size of fields:

  • UUID = 16 bytes

  • 4 Strings = ~240 bytes

  • 2 int = 8 bytes

  • 2 boolean (padded) = ~8 bytes

  • 1 long = 8 bytes

→ Total = ~280–320 bytes for fields

3. Additional Memory Per Entry (Cache-Level Overhead)

When storing these objects in a map or cache (e.g., ConcurrentHashMap or Caffeine), we also need to account for:

  • Key Object — ~40–60 bytes: UUID or String, including its internal character array.

  • Map Entry Overhead — ~32–48 bytes: bucket pointer, hash, and references.

  • Value Object — ~300–350 bytes: as estimated above.

  • References — ~8–16 bytes: references to the key and value.

Total per cache entry: ~400–500 bytes, conservatively. In the worst case, it may grow to 550–600 bytes.

4. Total Estimated Memory Usage

To calculate the total memory for 1 million entries, multiply the per-entry estimate by the entry count: 1,000,000 entries × ~400–500 bytes/entry ≈ 400–500 MB of heap.

Buffering for Safety

Due to GC metadata, alignment, fragmentation, and fluctuations in field size:

  • Add 30–40% buffer

  • Recommended Heap Size: 700–800 MB
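The arithmetic above can be sketched in a few lines (a minimal sketch; the per-entry byte range and buffer percentages are the ones estimated in this section, and MB here means decimal megabytes):

```java
public class CacheSizeEstimate {

    // Estimated heap in (decimal) MB for a given per-entry size and safety buffer.
    public static long estimateMB(long entries, long bytesPerEntry, double buffer) {
        return Math.round(entries * bytesPerEntry * (1.0 + buffer) / 1_000_000.0);
    }

    public static void main(String[] args) {
        long entries = 1_000_000L;
        // Per-entry range from the table above, plus a 30–40% safety buffer
        System.out.println("Low:  " + estimateMB(entries, 400, 0.30) + " MB"); // 520 MB
        System.out.println("High: " + estimateMB(entries, 500, 0.40) + " MB"); // 700 MB
    }
}
```

The high end of this range lines up with the recommended 700–800 MB heap.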

5. How to Measure Actual Memory Usage (Empirical Approach)?

To validate the estimate with real data:

  1. Write a test class that loads 1 million objects into a Map.

  2. Use a Java profiler:

    • JVisualVM: Attach to the JVM and inspect heap usage.

    • Eclipse MAT (Memory Analyzer Tool): For analyzing heap dumps.

    • YourKit or JProfiler: For in-depth memory profiling.

  3. Compare memory usage before and after population.
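Steps 1 and 3 above can be sketched as follows (a rough sketch: the `CachedRecord` class and field values are hypothetical, and `Runtime`-based measurement is only approximate because `System.gc()` is just a hint to the JVM):

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

public class CacheMemoryTest {

    // Hypothetical value object mirroring the field mix estimated above.
    static class CachedRecord {
        final UUID id = UUID.randomUUID();
        final String name = "customer-name-placeholder";
        final String email = "user@example.com";
        int loginCount;
        long updatedAt = System.currentTimeMillis();
        boolean active = true;
    }

    // Approximate used heap; System.gc() is a hint only, so readings fluctuate.
    static long usedHeap() {
        System.gc();
        Runtime rt = Runtime.getRuntime();
        return rt.totalMemory() - rt.freeMemory();
    }

    public static Map<UUID, CachedRecord> load(int entries) {
        Map<UUID, CachedRecord> cache = new HashMap<>(entries * 2);
        for (int i = 0; i < entries; i++) {
            CachedRecord r = new CachedRecord();
            cache.put(r.id, r);
        }
        return cache;
    }

    public static void main(String[] args) {
        long before = usedHeap();
        Map<UUID, CachedRecord> cache = load(1_000_000);
        long after = usedHeap();
        System.out.printf("Entries: %d, approx heap used: %d MB%n",
                cache.size(), (after - before) / (1024 * 1024));
    }
}
```

For trustworthy numbers, pair this with a heap dump analyzed in one of the profilers listed above rather than relying on `Runtime` alone.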

2. LRU Cache

Design and build a "least recently used" cache, which evicts the least recently used item. The cache should map from keys to values (allowing us to insert and retrieve a value associated with a particular key) and be initialized with a max size. When it is full, it should evict the least recently used item.

We need to design a cache that:

  1. Stores key-value pairs.

  2. Has a maximum size (capacity).

  3. When inserting a new item and the cache is full, it must evict the least recently used (LRU) item.

  4. Both get(key) and put(key, value) operations must be done in O(1) time.

To support O(1) operations:

  • Use a HashMap to store key → node mappings.

  • Use a Doubly Linked List to track usage order (most recently used at head, least at tail).

Whenever:

  • We get(key): Move the node to the front (MRU).

  • We put(key, value):

    • If key exists: Update and move to front.

    • If key doesn’t exist and at capacity: Remove tail (LRU), insert new node at head.
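The design above (HashMap for O(1) lookup, doubly linked list with sentinel head/tail nodes for usage order) can be sketched like this. It assumes int keys and values and a `get` that returns -1 for a missing key; those are simplifications for the sketch, not part of the original statement:

```java
import java.util.HashMap;
import java.util.Map;

public class LRUCache {

    private static class Node {
        int key, value;
        Node prev, next;
        Node(int key, int value) { this.key = key; this.value = value; }
    }

    private final int capacity;
    private final Map<Integer, Node> map = new HashMap<>();
    private final Node head = new Node(0, 0); // sentinel: MRU side
    private final Node tail = new Node(0, 0); // sentinel: LRU side

    public LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    // Unlink a node from the list in O(1).
    private void remove(Node n) {
        n.prev.next = n.next;
        n.next.prev = n.prev;
    }

    // Insert a node right after head, marking it most recently used.
    private void addToFront(Node n) {
        n.next = head.next;
        n.prev = head;
        head.next.prev = n;
        head.next = n;
    }

    public int get(int key) {
        Node n = map.get(key);
        if (n == null) return -1;
        remove(n);        // accessed: move to front (MRU)
        addToFront(n);
        return n.value;
    }

    public void put(int key, int value) {
        Node n = map.get(key);
        if (n != null) {  // key exists: update value and move to front
            n.value = value;
            remove(n);
            addToFront(n);
            return;
        }
        if (map.size() == capacity) { // full: evict LRU (node just before tail)
            Node lru = tail.prev;
            remove(lru);
            map.remove(lru.key);
        }
        Node fresh = new Node(key, value);
        map.put(key, fresh);
        addToFront(fresh);
    }
}
```

Both operations touch only the map and a constant number of list links, so `get` and `put` are O(1). In production Java, `LinkedHashMap` with access order and `removeEldestEntry` gives the same behavior with far less code; the explicit list is shown here because it is the classic interview answer.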
