Caching a Next.js API proxy layer with Redis (ioredis) — what to cache, what not to.
Every Next.js app eventually talks to a slower upstream API. A thin proxy layer with Redis in front can cut response times from hundreds of milliseconds to single digits — but only if you cache the right things. Here is a practical, numbers-driven guide to deciding what belongs in Redis and what does not.
Why a proxy layer at all
1- Next.js App Router already has fetch-level caching, but it is tied to the render cycle and hard to invalidate precisely across instances.
2- A dedicated proxy (route handlers or a small Node service) gives you one place to add Redis, rate limits, auth headers, and schema validation — and lets you share the cache between SSR, client fetches, and background jobs.

1- What you should cache
Good cache candidates are read-heavy, change slowly, and are expensive or slow to regenerate. These are the wins that show up clearly in p95 latency graphs.
Strong candidates:
- ✅ Reference data that rarely changes — lookup lists, taxonomy, country/city data, unit types. TTL of hours or days is fine and the hit rate is usually above 95%.
- ✅ Expensive aggregations and search results — filtered listing pages, map cluster tiles, or any endpoint that joins several upstream calls. Cache by a stable key built from the normalized query string.
- ✅ Public, non-personalized responses — anything the same for every visitor. These give the biggest ratio of Redis bytes to milliseconds saved and are safe to share across users.
- ✅ Upstream responses you are about to transform — cache the raw upstream payload, not the transformed one, so one cache entry can serve multiple callers with different shaping needs.
Example: cached upstream call with ioredis
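A minimal read-through sketch. The `CacheClient` interface mirrors the `get` / `set(key, value, "EX", ttl)` calls an ioredis client exposes, so in a real route handler you would pass `new Redis()` in its place; the upstream host, the `proxy:` key prefix, and the 300-second default TTL are illustrative choices, not requirements.

```typescript
// Minimal client interface — ioredis's Redis class satisfies it,
// so you can pass `new Redis()` here in production.
interface CacheClient {
  get(key: string): Promise<string | null>;
  set(key: string, value: string, mode: "EX", ttlSeconds: number): Promise<unknown>;
}

// Build a stable key from the normalized query string so that
// ?b=2&a=1 and ?a=1&b=2 share a single cache entry.
export function cacheKey(path: string, params: Record<string, string>): string {
  const normalized = Object.keys(params)
    .sort()
    .map((k) => `${k}=${encodeURIComponent(params[k])}`)
    .join("&");
  return `proxy:${path}?${normalized}`;
}

// Read-through cache: serve from Redis on a hit, otherwise fetch the
// raw upstream payload (not a transformed view) and store it with a TTL.
export async function cachedFetch(
  cache: CacheClient,
  path: string,
  params: Record<string, string>,
  ttlSeconds = 300,
  // Injectable for testing; the default hits a hypothetical upstream host.
  fetchUpstream: (url: string) => Promise<string> = async (url) =>
    (await fetch(url)).text(),
): Promise<string> {
  const key = cacheKey(path, params);
  const hit = await cache.get(key);
  if (hit !== null) return hit;

  const body = await fetchUpstream(`https://upstream.example.com${path}`);
  await cache.set(key, body, "EX", ttlSeconds);
  return body;
}
```

Because the key is built from the sorted query string, two callers asking for the same listing filter in a different parameter order hit the same entry.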

Typical before/after latency
On a hit, the proxy answers from Redis in single-digit milliseconds; on a miss you still pay the upstream's usual hundreds of milliseconds plus a small Redis round trip. The exact numbers depend on your upstream, so measure them in your own p95 graphs rather than trusting anyone else's benchmark.

2- What you should NOT cache
Caching the wrong response is worse than no cache — it leaks data, masks bugs, and makes incidents harder to diagnose. The rule of thumb is: if a wrong cached value can hurt a user, do not cache it.
Anti-patterns to avoid:
- ❌ Authenticated, per-user responses without a user-scoped key — account details, favorites, reservation status. If you must cache them, include the user id in the key and use very short TTLs.
- ❌ Write endpoints or anything that mutates state. Caching POST, PUT, PATCH, or DELETE responses is almost always a bug, even when the response body looks safe to reuse.
- ❌ Responses that embed short-lived tokens, signed URLs, or CSRF values. A TTL longer than the token lifetime turns a cache hit into a broken request for the next user.
- ❌ Upstream errors. A 5xx or a timeout should never be stored as a successful cache entry — if you want to absorb upstream pain, cache a stale successful response and serve it with a short revalidation.
- ❌ Huge payloads that blow past your Redis memory budget. Weigh the bytes stored against the milliseconds saved — a 2 MB response saving 80 ms is almost never worth the eviction pressure it creates.
Cache key shape for per-user data
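One way to shape such a key, following the rule above of scoping by user id and keeping the TTL short. The `v1` version prefix and the 30-second TTL are illustrative assumptions: the prefix lets you invalidate the whole namespace by bumping it after a schema change.

```typescript
// Per-user responses must never share a key across users.
// Returning null for anonymous requests means "do not cache at all".
export function userCacheKey(userId: string | null, path: string): string | null {
  if (!userId) return null;
  // "v1" is a namespace version — bump it to drop every entry at once.
  return `v1:user:${userId}:${path}`;
}

// Short TTL for per-user data: favorites that are 30 s stale are
// tolerable; reservation status that is an hour stale is not.
export const USER_TTL_SECONDS = 30;
```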

3- TTL and invalidation — the part everyone gets wrong
Picking a TTL is not about guessing a number. It is about how fresh the data must look and how much load the upstream can take during a cache stampede.
Rules that hold up in production:
- ✅ Match TTL to tolerated staleness, not to traffic. A listing page that can be five minutes stale should use 300 seconds even if the endpoint is hit once per hour.
- ✅ Add jitter to TTLs so a thousand keys do not expire at the same second. A random ±10% window is enough to keep the upstream from being stampeded on a flash event.
- ✅ Prefer explicit invalidation on writes over short TTLs. Publishing a delete on a Redis channel when an admin edits a listing is cheaper than expiring the whole namespace every minute.
- ✅ Serve stale-while-revalidate from Redis when the upstream is down. A one-hour stale listing is almost always better than a 500 error page.
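The jitter rule above can be sketched as a small helper — the ±10% default matches the window suggested here; pass the result as the `EX` argument when you write the key.

```typescript
// TTL with ±10% jitter so keys written in the same burst do not
// all expire in the same second and stampede the upstream.
export function jitteredTtl(baseSeconds: number, spread = 0.1): number {
  // factor is uniform in [1 - spread, 1 + spread]
  const factor = 1 + (Math.random() * 2 - 1) * spread;
  return Math.max(1, Math.round(baseSeconds * factor));
}
```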
4- You cannot tune what you do not measure
Before you add a single cache, write down the numbers you expect to move. After the rollout, check them. If the hit rate is low or the latency did not change, the cache is paying rent for nothing.
Metrics worth shipping from day one:
- ✅ Hit rate per cache namespace — not global. A global number hides one bad key family dragging the average down.
- ✅ p50 and p95 latency on the proxy route, split by hit and miss. The gap between them is the value your cache is actually delivering.
- ✅ Upstream call count per minute. The whole point of the cache is to flatten this graph — if it did not flatten, something is wrong with your keys.
- ✅ Redis memory and eviction counters. An eviction storm is the earliest signal that you are caching something you should not be.
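The per-namespace hit rate from the first bullet can be tracked with a few lines of bookkeeping. This sketch keeps counters in an in-memory map for illustration; in production you would ship the same counts to whatever metrics backend you already run.

```typescript
// Per-namespace hit/miss counters — a single global hit rate hides
// one bad key family dragging the average down.
const stats = new Map<string, { hits: number; misses: number }>();

export function recordLookup(namespace: string, hit: boolean): void {
  const s = stats.get(namespace) ?? { hits: 0, misses: 0 };
  if (hit) s.hits++;
  else s.misses++;
  stats.set(namespace, s);
}

export function hitRate(namespace: string): number {
  const s = stats.get(namespace);
  if (!s || s.hits + s.misses === 0) return 0;
  return s.hits / (s.hits + s.misses);
}
```

Call `recordLookup` once per cache read, keyed by the namespace portion of the cache key, and chart `hitRate` per namespace.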