Caching is one of the most powerful techniques to improve application performance. This post explains what caching is, why it matters, and how to implement it effectively in real-world systems.
Caching: Making Your Applications Blazing Fast
Introduction
Let me be real with you.
You built an amazing backend. Your APIs are clean, your database queries are optimized, and everything works perfectly — until traffic increases.
Suddenly:
- API response times spike
- Database CPU goes through the roof
- Users start complaining about slow load times
What went wrong? You forgot caching.
Caching feels "optional" early on, but becomes absolutely critical as your app grows. The best part? Once you understand the fundamentals, it's not even that hard to implement.
Let's break it down.
The Core Idea: Why Caching Exists
Caching means storing frequently accessed data in a faster storage layer so you don't have to recompute or re-fetch it on every request.
Think of it like this:
| Layer | Speed | Durability |
|---|---|---|
| Database | Slow | Persistent |
| Cache | Fast | Temporary |
Instead of hitting your database every time, you follow this flow:
- Do I already have this data in the cache?
  - Yes → return it instantly ⚡
  - No → fetch from the DB, store it in the cache, then return it
The Problem Without Caching
Consider a simple API endpoint:
```python
def get_user(user_id):
    user = db.query(User).filter(User.id == user_id).first()
    return user
```

Now imagine 10,000 users hit this endpoint per minute, all requesting the same data. That means:
- 10,000 redundant DB queries
- Wasted CPU cycles
- Increased latency for everyone
This is called redundant computation — and it's entirely avoidable.
Adding a Cache Layer
```python
import json

def get_user(user_id):
    cache_key = f"user:{user_id}"

    # Check the cache first
    cached = redis.get(cache_key)
    if cached:
        return json.loads(cached)

    # Cache miss: fetch from the DB
    user = db.query(User).filter(User.id == user_id).first()

    # Redis stores strings/bytes, so serialize before caching
    # (assumes the model exposes a to_dict() helper).
    data = user.to_dict()
    redis.set(cache_key, json.dumps(data), ex=60)  # expire after 60 seconds
    return data
```

What happens now:
- First request → hits the database
- Next 9,999 requests → served from cache ⚡
Your database breathes. Your app becomes fast.
Types of Caching
1. In-Memory Cache
Stored directly inside your application's memory.
Examples: Python dict, functools.lru_cache
| Pros | Cons |
|---|---|
| Extremely fast | Not shared across instances |
| No network call | Lost on restart |
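The trade-offs in the table are easy to see with Python's built-in `functools.lru_cache`, a minimal in-memory cache that lives inside a single process:

```python
from functools import lru_cache

@lru_cache(maxsize=1024)  # keep up to 1024 results in process memory
def expensive_lookup(user_id):
    # Stand-in for a slow DB query or computation
    return f"user-{user_id}"

expensive_lookup(42)   # first call: computed (a miss)
expensive_lookup(42)   # second call: served from memory (a hit)
print(expensive_lookup.cache_info())  # hits=1, misses=1
```

Because the cache lives in this process only, a second server instance would hold its own independent copy, and a restart wipes it: exactly the cons listed above.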
2. Distributed Cache
Stored in a shared external system, accessible by all your app instances.
Examples: Redis, Memcached
| Pros | Cons |
|---|---|
| Shared across servers | Slight network latency |
| Scalable | Requires external dependency |
3. CDN Caching
Used for static assets — images, CSS, JS files. CDNs cache content at edge nodes geographically close to users, dramatically reducing load times.
4. Database Query Caching
Caches the results of expensive or frequently-run queries. Ideal when:
- The query is computationally heavy
- The underlying data doesn't change often
Cache Invalidation — The Hard Part 😅
There's a famous quote in computer science:
"There are only two hard things in Computer Science: cache invalidation and naming things." — Phil Karlton
Why is it hard? Because cached data can become stale — out of sync with the source of truth.
Strategies
1. Time-Based Expiry (TTL)
```python
redis.set("key", value, ex=60)  # expires after 60 seconds
```

Simple, effective, and the most common approach.
2. Manual Invalidation
Delete the cached entry whenever the underlying data changes:
```python
def update_user(user_id, data):
    db.update(user_id, data)
    redis.delete(f"user:{user_id}")
```

Ensures freshness, but requires discipline across your codebase.
3. Write-Through Cache
Update the cache and the database simultaneously on every write. Keeps data consistent, but adds write latency.
4. Write-Back (Write-Behind)
Write to the cache first; sync to the database asynchronously. Best for high-throughput systems where slight delays are acceptable.
Common Caching Patterns
Cache Aside (Lazy Loading)
The most widely used pattern.
Read request → Check cache → Hit? Return. Miss? Fetch DB → Store in cache → Return.
Best for read-heavy workloads.
Write Through
Write request → Update cache → Update DB
Best for consistency-critical applications.
Write Back
Write request → Update cache → (DB updated later, asynchronously)
Best for high-performance, write-heavy systems.
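The cache-aside read path above can be captured in one small helper. Here a dict with expiry timestamps stands in for Redis, and `fetch` is whatever loads the data from the source of truth:

```python
import time

_store = {}  # key -> (value, expiry); a dict standing in for Redis

def cache_aside(key, ttl, fetch):
    """Return a cached value if still fresh, else fetch, cache, and return it."""
    entry = _store.get(key)
    if entry and entry[1] > time.monotonic():
        return entry[0]                            # hit: skip the source entirely
    value = fetch()                                # miss: go to the source of truth
    _store[key] = (value, time.monotonic() + ttl)  # lazy-load into the cache
    return value

calls = []
def load_row():
    calls.append(1)          # count how often the "database" is hit
    return {"id": 7}

cache_aside("user:7", 60, load_row)
cache_aside("user:7", 60, load_row)
print(len(calls))  # 1: the second read never touched the source
```

Note the cache is only populated on demand, which is why the pattern is also called lazy loading: keys nobody reads never take up cache space.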
When NOT to Cache
Caching is powerful, but it's not always the right tool.
Avoid caching when:
- Data changes very frequently
- Data must always be real-time accurate (e.g., bank balances, stock prices)
- The overhead of maintaining the cache outweighs the benefit
Real-World Use Cases
| Use Case | Caching Approach |
|---|---|
| User sessions | Redis (distributed) |
| API responses | Cache aside with TTL |
| E-commerce product listings | Redis with manual invalidation |
| Leaderboards | In-memory or Redis sorted sets |
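For the leaderboard row, Redis sorted sets keep members ordered by score. A dict plus a sort is enough to sketch what the `ZINCRBY` and `ZREVRANGE` commands do (the command names are from Redis; the functions below are local stand-ins, not redis-py calls):

```python
scores = {}  # member -> score, standing in for a Redis sorted set

def zincrby(member, amount):
    # Mirrors Redis ZINCRBY: add to a member's score, creating it if absent.
    scores[member] = scores.get(member, 0) + amount

def top(n):
    # Mirrors ZREVRANGE 0 n-1 WITHSCORES: highest scores first.
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)[:n]

zincrby("alice", 50)
zincrby("bob", 30)
zincrby("alice", 20)
print(top(2))  # [('alice', 70), ('bob', 30)]
```

The appeal of the real thing is that Redis maintains the ordering incrementally, so reading the top N is fast even with millions of members, with no full sort on every read.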
Summary
Caching is one of the highest-leverage performance optimizations you can make. Start simple: a Redis cache-aside setup with a sensible TTL gets you 80% of the benefit with 20% of the complexity. As your system grows, layer in more sophisticated patterns like write-through or write-back where appropriate.
The database should be the last resort, not the first call.