Cache Design
1. Estimate Cache Memory needed for caching 1 million entries of a Java object
About
In a typical Spring Boot microservice, suppose we are caching frequently accessed data using an in-memory cache like ConcurrentHashMap, Caffeine, or Spring’s built-in caching abstraction. Each cached entry consists of:
Key: A unique identifier (e.g., String or UUID)
Value: A Java object with approximately 10–15 fields
Let’s estimate how much JVM heap memory will be required to cache 1 million such objects.
Breakdown of Memory Usage
1. Object Overhead (JVM Internal Structure)
Every object in the JVM has some built-in memory overhead, regardless of the fields it contains. This includes:
| Component | Size | Notes |
| --- | --- | --- |
| Object header | 12 bytes (on a 64-bit JVM with compressed oops) | Stores object metadata such as the identity hash, GC info, and type pointer |
| Object padding | Up to 8 bytes | The JVM aligns object size to 8-byte multiples for efficient memory access |
| Reference fields | 4 or 8 bytes each | Depending on whether compressed oops are enabled |
| Internal pointers | Varies | Add indirection and slight overhead when objects contain references to other objects |
2. Estimated Field Composition (Example: 10 Fields)
Let's assume the object being cached looks like this:
| Field type | Count | Size | Notes |
| --- | --- | --- | --- |
| UUID | 1 | 16 bytes | Backed by two long values |
| String | 4 | 40–80 bytes each | Depends on string length; average ~60 bytes |
| int | 2 | 4 bytes each | Total: 8 bytes |
| boolean | 2 | 1 byte each | Padded to 4–8 bytes total |
| long | 1 | 8 bytes | 64-bit long |
Total estimated size of fields:
UUID = 16 bytes, 4 Strings ≈ 240 bytes, 2 ints = 8 bytes, 2 booleans (padded) ≈ 8 bytes, 1 long = 8 bytes
→ Total = ~280–320 bytes for fields
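For concreteness, a value object with this field mix might look like the following sketch. The `CachedUser` name and its fields are hypothetical, chosen only to match the 10-field composition estimated above:

```java
import java.util.UUID;

// Hypothetical cached value object: 1 UUID, 4 Strings, 2 ints,
// 2 booleans, 1 long — the 10-field mix from the estimate above.
record CachedUser(
        UUID id,              // two longs internally (~16 bytes of data)
        String firstName,     // ~40–80 bytes each, incl. backing array
        String lastName,
        String email,
        String city,
        int age,              // 4 bytes
        int loginCount,       // 4 bytes
        boolean active,       // 1 byte, padded
        boolean verified,     // 1 byte, padded
        long lastSeenEpoch) { // 8 bytes
}
```

Note that reference-typed fields (the UUID and Strings) are held via 4- or 8-byte references; the quoted sizes include the objects they point to.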
3. Additional Memory Per Entry (Cache-Level Overhead)
When storing these objects in a map or cache (e.g., ConcurrentHashMap or Caffeine), we also need to account for:
| Component | Size | Notes |
| --- | --- | --- |
| Key object | ~40–60 bytes | UUID or String, including its internal character array |
| Map entry overhead | ~32–48 bytes | Bucket pointer, hash, references |
| Value object | ~300–350 bytes | As estimated above |
| References | ~8–16 bytes | References to the key and value |
Total per cache entry: ~400–500 bytes conservatively; in worst cases it may grow to 550–600 bytes.
4. Total Estimated Memory Usage
To calculate total memory for 1 million entries: 1,000,000 entries × ~500 bytes per entry ≈ 500 MB of raw heap.
Buffering for Safety
Due to GC metadata, alignment, fragmentation, and fluctuations in field size:
Add 30–40% buffer
Recommended Heap Size: 700–800 MB
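As a quick sanity check, the arithmetic above can be sketched directly. The 500-byte figure and the 40% buffer are the estimates from this section, not measured values:

```java
// Back-of-envelope calculation using the per-entry estimate above.
public class HeapEstimate {
    public static void main(String[] args) {
        long entries = 1_000_000L;
        long bytesPerEntry = 500;                // conservative per-entry estimate
        long rawBytes = entries * bytesPerEntry; // 500,000,000 bytes ≈ 477 MiB
        long buffered = rawBytes * 14 / 10;      // +40% safety buffer
        System.out.println("Raw: " + rawBytes / (1024 * 1024) + " MiB");
        System.out.println("With buffer: " + buffered / (1024 * 1024) + " MiB");
    }
}
```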
5. How to Measure Actual Memory Usage (Empirical Approach)?
To validate the estimate with real data:
Write a test class to load 1 million objects into a Map, then use a Java profiler:
JVisualVM: Attach to the JVM and inspect heap usage.
Eclipse MAT (Memory Analyzer Tool): For analyzing heap dumps.
YourKit or JProfiler: For in-depth memory profiling.
Compare memory usage before and after population.
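A minimal sketch of such a test class, using Runtime heap statistics as a rough first pass before reaching for a profiler. The class name and the String placeholder values are illustrative; System.gc() is only a hint, so treat the resulting numbers as approximate:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.UUID;

// Rough empirical check: populate a map with 1M entries and compare
// used heap before and after. The value here is a String placeholder;
// substitute the real cached object for a realistic per-entry number.
public class CacheMemoryProbe {

    static long usedHeap() {
        Runtime rt = Runtime.getRuntime();
        System.gc(); // a hint only; measurements remain approximate
        return rt.totalMemory() - rt.freeMemory();
    }

    public static void main(String[] args) {
        long before = usedHeap();

        Map<UUID, String> cache = new HashMap<>(1_000_000);
        for (int i = 0; i < 1_000_000; i++) {
            cache.put(UUID.randomUUID(), "value-" + i);
        }

        long after = usedHeap();
        System.out.printf("Approx. bytes per entry: %d%n",
                (after - before) / cache.size());
    }
}
```

For trustworthy numbers, take a heap dump after population and open it in Eclipse MAT or a commercial profiler, which attribute retained sizes per class.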
2. LRU Cache
Design and build a "least recently used" cache, which evicts the least recently used item. The cache should map from keys to values (allowing us to insert and retrieve a value associated with a particular key) and be initialized with a max size. When it is full, it should evict the least recently used item.
We need to design a cache that:
Stores key-value pairs.
Has a maximum size (capacity).
When inserting a new item and the cache is full, it must evict the least recently used (LRU) item.
Both get(key) and put(key, value) operations must run in O(1) time.
To support O(1) operations:
Use a HashMap to store key → node mappings.
Use a Doubly Linked List to track usage order (most recently used at head, least at tail).
Whenever:
get(key): move the node to the front (MRU position).
put(key, value): if the key exists, update the value and move the node to the front.
If the key doesn't exist and the cache is at capacity: remove the tail node (LRU), then insert the new node at the head.
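Putting the pieces together, here is a minimal generic sketch of the HashMap-plus-doubly-linked-list design described above. Sentinel head and tail nodes simplify the pointer bookkeeping; the class and method names are illustrative:

```java
import java.util.HashMap;
import java.util.Map;

// LRU cache sketch: HashMap for O(1) lookup plus a doubly linked list
// (with sentinel head/tail) tracking usage order. Most recently used
// nodes sit right after head; the node before tail is evicted first.
class LRUCache<K, V> {

    private static final class Node<K, V> {
        K key; V value;
        Node<K, V> prev, next;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private final int capacity;
    private final Map<K, Node<K, V>> map = new HashMap<>();
    private final Node<K, V> head = new Node<>(null, null); // sentinel
    private final Node<K, V> tail = new Node<>(null, null); // sentinel

    LRUCache(int capacity) {
        this.capacity = capacity;
        head.next = tail;
        tail.prev = head;
    }

    public V get(K key) {
        Node<K, V> node = map.get(key);
        if (node == null) return null;
        moveToFront(node); // accessed: promote to MRU
        return node.value;
    }

    public void put(K key, V value) {
        Node<K, V> node = map.get(key);
        if (node != null) {            // existing key: update and promote
            node.value = value;
            moveToFront(node);
            return;
        }
        if (map.size() == capacity) {  // full: evict LRU (node before tail)
            Node<K, V> lru = tail.prev;
            unlink(lru);
            map.remove(lru.key);
        }
        node = new Node<>(key, value);
        map.put(key, node);
        linkAfterHead(node);
    }

    private void moveToFront(Node<K, V> node) {
        unlink(node);
        linkAfterHead(node);
    }

    private void unlink(Node<K, V> node) {
        node.prev.next = node.next;
        node.next.prev = node.prev;
    }

    private void linkAfterHead(Node<K, V> node) {
        node.next = head.next;
        node.prev = head;
        head.next.prev = node;
        head.next = node;
    }
}
```

In production Java code, extending LinkedHashMap in access order and overriding removeEldestEntry achieves the same behavior with far less code; the explicit linked-list version above is the structure interviewers usually expect.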