Cloudflare edge cache: 50ms or 500ms?
I have a site that makes 50k requests/day to an external API. Without edge caching, every request left the browser, went to my server in São Paulo, and waited while the server called the external API. Total: 800ms.
With Cloudflare edge cache: same request, 50ms.
That’s different from “faster site”. It’s “site so fast it feels instant”.
The secret is understanding which Cloudflare tool to use: KV (fast key-value), R2 (S3-like storage), or Cache API (native cache).
Each one is a different case.
KV, R2, and Cache API: what’s each one
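Cloudflare KV: a distributed key-value store on the edge. You store key-value pairs. “Fast” means ~50ms reads globally for frequently read keys; writes are eventually consistent and can take around a minute to reach every location.
Cloudflare R2: object storage (like AWS S3). You store large files. It charges no egress fees, which usually makes it cheaper than S3. Less “immediate” than KV, but supports streaming.
Cloudflare Cache API: the native HTTP cache, exposed to Workers. You store and look up responses explicitly with cache.put() and cache.match(), and entries live in the data center that handled the request. Lighter than both previous options.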
When to use each one:
- KV: small data that is read often and updated occasionally (user session, API cache, dynamic config)
- R2: large files (images, videos, PDFs), or when you need S3-like storage
- Cache API: any HTTP response you want cached on the edge
Real example: API response caching
My site calls an external API that returns a product list. Without cache:
Browser -> Cloudflare -> My server in São Paulo -> External API -> Response (800ms)
With edge cache:
Browser -> Cloudflare cache -> Response (50ms)
How? Using Workers + Cache API:
// In your Cloudflare Worker script (the file that `main` points to in wrangler.toml)
export default {
async fetch(request, env) {
const url = new URL(request.url);
// Cache only GET requests to /api/products
if (request.method === "GET" && url.pathname === "/api/products") {
const cacheKey = new Request(url, { method: "GET" });
const cache = caches.default;
// Try fetching from cache
let response = await cache.match(cacheKey);
if (response) {
// Found in cache! Return immediately
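// Spreading a Headers object copies nothing, so re-wrap the response
// (this keeps status and headers) and set the extra header on the copy
const hit = new Response(response.body, response);
hit.headers.set("X-Cache", "HIT");
return hit;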
}
// Not found. Call the API
response = await fetch("https://api.external.com/products");
// Cache for 1 hour
const cacheHeaders = new Headers(response.headers);
cacheHeaders.set("Cache-Control", "public, max-age=3600");
const cachedResponse = new Response(response.body, {
status: response.status,
headers: cacheHeaders
});
// Store in cache
await cache.put(cacheKey, cachedResponse.clone());
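// Spreading a Headers object copies nothing, so set the header directly.
// The copy stored in the cache above does not carry this header.
cachedResponse.headers.set("X-Cache", "MISS");
return cachedResponse;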
}
// For other requests, pass through
return fetch(request);
}
};
Result:
- First request: MISS, 800ms (called API)
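- Requests during the following hour: HIT, 50ms
- After 1 hour: MISS again, 800ms
50k requests/day is about 35 requests per minute, or roughly 2,080 per hour. With a 1-hour TTL (3600 seconds), that is up to ~2,080 hits per miss, assuming all traffic reaches the same data center. The Cache API is local to each data center, so traffic spread across several locations produces more misses.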
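The hit-per-miss figure can be checked with a few lines of arithmetic. This is a rough sketch that assumes evenly spread traffic and a single cache location; the function names are made up for illustration.

```javascript
// Estimate hits per miss for one cache location.
// Assumes evenly spread traffic, which real traffic never quite is.
function hitsPerMiss(requestsPerDay, ttlSeconds) {
  const requestsPerSecond = requestsPerDay / 86400;
  const requestsPerTtlWindow = requestsPerSecond * ttlSeconds;
  // The first request in each TTL window is the miss; the rest are hits.
  return Math.max(0, requestsPerTtlWindow - 1);
}

// Lower bound on origin calls per day: one miss per TTL window per location.
function missesPerDay(ttlSeconds, locations = 1) {
  return Math.ceil(86400 / ttlSeconds) * locations;
}

console.log(Math.round(hitsPerMiss(50000, 3600))); // 2082
console.log(missesPerDay(3600)); // 24
```

Passing a larger `locations` value shows how quickly misses grow when visitors are spread across several data centers.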
Impact:
| Metric | Without cache | With cache | Improvement |
|---|---|---|---|
| p99 latency | 850ms | 75ms | 11x faster |
| Requests/day to external API | 50k | 33 | 1500x less |
| Bandwidth cost | $150/month | $5/month | -97% |
| KV cost (reads) | $0 | $3/month | negligible |
Math is obvious.
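One possible refinement of the Worker above: on a miss it waits for cache.put() before answering. The sketch below writes to the cache in the background with ctx.waitUntil() and skips caching upstream errors. The helper name `cacheAside` and the `loadFromOrigin` callback are assumptions for illustration, not part of any Cloudflare API.

```javascript
// Cache-aside with a background cache write.
// `cache` is anything with match()/put() (caches.default in a Worker),
// `ctx` is the Worker execution context, `loadFromOrigin` is a hypothetical callback.
async function cacheAside(request, cache, ctx, ttlSeconds, loadFromOrigin) {
  const cacheKey = new Request(request.url, { method: "GET" });

  const cached = await cache.match(cacheKey);
  if (cached) {
    // Re-wrap so the headers are mutable, keeping status and headers intact
    const hit = new Response(cached.body, cached);
    hit.headers.set("X-Cache", "HIT");
    return hit;
  }

  const origin = await loadFromOrigin(request);
  if (!origin.ok) {
    return origin; // never cache upstream errors
  }

  const fresh = new Response(origin.body, origin);
  fresh.headers.set("Cache-Control", `public, max-age=${ttlSeconds}`);
  // Store a copy in the background so the visitor is not kept waiting
  ctx.waitUntil(cache.put(cacheKey, fresh.clone()));
  fresh.headers.set("X-Cache", "MISS");
  return fresh;
}
```

Inside a Worker whose handler is declared as `async fetch(request, env, ctx)`, this could be called as `cacheAside(request, caches.default, ctx, 3600, () => fetch("https://api.external.com/products"))`.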
KV for small, dynamic data
If the API response changes every hour, or you want to update it manually, KV is a better fit than the Cache API.
export default {
async fetch(request, env) {
const url = new URL(request.url);
if (url.pathname === "/api/current-user") {
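const userId = url.searchParams.get("id");
// Without an id there is nothing to look up
if (!userId) {
return new Response("Missing id", { status: 400 });
}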
// Try fetching from KV
const cached = await env.KV.get(`user:${userId}`);
if (cached) {
return new Response(cached, {
headers: { "X-Cache": "HIT", "Content-Type": "application/json" }
});
}
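// Call API
const response = await fetch(`https://api.external.com/users/${encodeURIComponent(userId)}`);
// Don't cache upstream errors
if (!response.ok) {
return new Response("Upstream error", { status: 502 });
}
const data = await response.json();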
// Store in KV for 5 minutes
await env.KV.put(`user:${userId}`, JSON.stringify(data), {
expirationTtl: 300 // 5 minutes
});
return new Response(JSON.stringify(data), {
headers: { "X-Cache": "MISS", "Content-Type": "application/json" }
});
}
return fetch(request);
}
};
KV is better here because:
- You can update manually with env.KV.put()
- Auto TTL (expires after 5 minutes)
- Latency is similar (50ms)
- You control invalidation
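The "update manually" point can be sketched as a small write-through helper: when the source data changes, push the new value into KV instead of waiting for the TTL. The helper names are hypothetical; the only KV call used is put(key, value, { expirationTtl }), and the 60-second floor reflects KV's minimum expirationTtl.

```javascript
const KV_MIN_TTL_SECONDS = 60; // KV rejects expirationTtl values below 60

// Build the arguments for env.KV.put() for one user record.
function buildUserEntry(userId, data, ttlSeconds = 300) {
  return {
    key: `user:${userId}`,
    value: JSON.stringify(data),
    options: { expirationTtl: Math.max(KV_MIN_TTL_SECONDS, ttlSeconds) },
  };
}

// Push a fresh value into KV as soon as the source data changes.
// `kv` is a KV namespace binding such as env.KV.
async function refreshUser(kv, userId, data, ttlSeconds) {
  const entry = buildUserEntry(userId, data, ttlSeconds);
  await kv.put(entry.key, entry.value, entry.options);
  return entry.key;
}
```

Because KV is eventually consistent, other locations may keep serving the previous value for a short while after `refreshUser` returns.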
R2 for optimized images
Say you have a site serving product images. You want:
- Upload original image (high quality, large)
- Optimize for different sizes (thumbnail, medium, large)
- Serve from edge (50ms)
R2 is ideal:
export default {
async fetch(request, env) {
const url = new URL(request.url);
// URL like /image/product-123-large.jpg
if (url.pathname.startsWith("/image/")) {
const imageKey = url.pathname.replace("/image/", "");
// Read size and quality up front so each variant gets its own R2 key
const width = url.searchParams.get("w") || "400";
const quality = url.searchParams.get("q") || "80";
const variantKey = `${imageKey}?w=${width}&q=${quality}`;
// Try fetching from R2 (get() resolves to null when the key is missing, it does not throw)
const object = await env.BUCKET.get(variantKey);
if (object) {
return new Response(object.body, {
headers: {
"Content-Type": object.httpMetadata?.contentType || "image/jpeg",
"Cache-Control": "public, max-age=31536000" // 1 year
}
});
}
// Not in R2: generate the variant with your image optimization service
// (for example one running Sharp or ImageMagick; this is complex, simplifying)
const response = await fetch(
`https://image-optimizer.your-domain.com/optimize?src=${encodeURIComponent(imageKey)}&w=${width}&q=${quality}`
);
// Don't store error pages as images
if (!response.ok) {
return new Response("Image not found", { status: 404 });
}
const optimized = await response.arrayBuffer();
// Store in R2 for next requests
await env.BUCKET.put(variantKey, optimized, {
httpMetadata: {
contentType: "image/jpeg"
}
});
return new Response(optimized, {
headers: {
"Content-Type": "image/jpeg",
"Cache-Control": "public, max-age=31536000"
}
});
}
return fetch(request);
}
};
Result: images served from edge without pre-generating all sizes.
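Generating sizes on demand means anyone can request arbitrary w and q values, and each combination becomes a stored object. A minimal sketch of one way to bound that, assuming a fixed size ladder: the `ALLOWED_WIDTHS` values and the key format are illustrative choices, not something R2 requires.

```javascript
const ALLOWED_WIDTHS = [200, 400, 800, 1600]; // hypothetical size ladder

// Snap the requested width up to the nearest allowed size, clamp quality,
// and derive one storage key per variant.
function normalizeVariant(requestUrl) {
  const url = new URL(requestUrl);
  const imageKey = url.pathname.replace("/image/", "");

  const requestedWidth = Number.parseInt(url.searchParams.get("w") ?? "400", 10);
  const safeWidth = Number.isNaN(requestedWidth) ? 400 : requestedWidth;
  const width =
    ALLOWED_WIDTHS.find((w) => w >= safeWidth) ?? ALLOWED_WIDTHS[ALLOWED_WIDTHS.length - 1];

  const requestedQuality = Number.parseInt(url.searchParams.get("q") ?? "80", 10);
  const quality = Number.isNaN(requestedQuality)
    ? 80
    : Math.min(100, Math.max(1, requestedQuality));

  return { imageKey, width, quality, variantKey: `${imageKey}/w${width}-q${quality}` };
}
```

The returned `variantKey` can then serve as the key for both the R2 lookup and the R2 write, so every allowed size is generated at most once.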
Cache API for static HTML
If you use Astro (SSG), sometimes you want aggressive static HTML caching. Cache API is simpler:
// In the Worker that serves your Astro build (this is Worker code, not astro.config.mjs)
export default {
async fetch(request, env) {
const url = new URL(request.url);
// Static HTML should be cached (GET only, so other methods never receive a cached page)
if (request.method === "GET" && (url.pathname.endsWith(".html") || url.pathname === "/")) {
const cache = caches.default;
const cacheKey = new Request(url, { method: "GET" });
let response = await cache.match(cacheKey);
if (response) {
return response;
}
// Fetch HTML from server
response = await env.ASSETS.fetch(request);
// If successful, cache it
if (response.status === 200) {
const cacheHeaders = new Headers(response.headers);
cacheHeaders.set("Cache-Control", "public, max-age=3600");
const toCache = new Response(response.body, {
status: response.status,
headers: cacheHeaders
});
await cache.put(cacheKey, toCache.clone());
return toCache;
}
return response;
}
return env.ASSETS.fetch(request);
}
};
Simple: Cache API + 1-hour TTL = fresh HTML without overhead.
Real measurements
Setup: an e-commerce site.
Before (no edge cache):
- p50: 350ms
- p95: 620ms
- p99: 850ms
- Requests/second: 8
After (KV + R2 + Cache API):
- p50: 45ms
- p95: 85ms
- p99: 120ms
- Requests/second: 200 (25x more throughput)
The difference isn’t just latency. Throughput rose dramatically because the edge absorbs most requests. Before, 8 origin servers were needed for horizontal scaling; now one is enough.
Real costs
Cloudflare Workers:
| What | Free tier | Paid | Price |
|---|---|---|---|
| Requests | 100k/day free | Unlimited | $0.30 per 1M (after the 10M/month included in the $5/month plan) |
| KV Reads | 100k/day free | Unlimited | $0.50 per 1M |
| KV Writes | 1k/day free | Unlimited | $5 per 1M |
| KV Storage | 1GB free | Unlimited | $0.50 per GB/month |
| R2 Storage | 10GB/month free | Unlimited | $0.015 per GB/month (egress is free) |
| Cache API | Free | Free | Zero additional cost |
For 50k requests/day (1.5M/month):
- Cache hits (KV reads): 1M = $0.50
- Cache misses (KV writes): 2k = $0.01
- Total: $0.51/month
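The estimate above can be reproduced with a short calculation. This sketch uses the KV read and write prices from the table and ignores the free tier and any plan base fee; the function name is made up for illustration.

```javascript
// Prices in USD per million operations, taken from the pricing table.
const KV_READ_PRICE_PER_MILLION = 0.5;
const KV_WRITE_PRICE_PER_MILLION = 5;

function monthlyKvCost(reads, writes) {
  const readCost = (reads / 1_000_000) * KV_READ_PRICE_PER_MILLION;
  const writeCost = (writes / 1_000_000) * KV_WRITE_PRICE_PER_MILLION;
  // Round to whole cents
  return Math.round((readCost + writeCost) * 100) / 100;
}

console.log(monthlyKvCost(1_000_000, 2_000)); // 0.51
```

Plugging in your own read and write counts shows that writes only start to matter when the miss rate is high.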
Your origin server saves much more than that in bandwidth.
TTL and invalidation
Common question: how long to cache?
- Data that changes rarely (category list, product info): 1-24 hours
- Data that changes frequently (price, stock): 5-30 minutes
- Real-time data (user cart): don’t cache, or use session
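These TTL guidelines can live in one lookup so every route uses the same policy. A minimal sketch: the category names and the exact values are illustrative picks from the ranges above, not recommendations.

```javascript
// TTLs in seconds; values are illustrative picks from the ranges above.
const TTL_BY_KIND = {
  catalog: 6 * 60 * 60, // changes rarely: 1-24 hours
  pricing: 10 * 60, // changes frequently: 5-30 minutes
  cart: 0, // real-time: do not cache
};

// Turn a data category into a Cache-Control header value.
function cacheControlFor(kind) {
  const ttl = TTL_BY_KIND[kind] ?? 0; // unknown data is treated as uncacheable
  return ttl > 0 ? `public, max-age=${ttl}` : "no-store";
}

console.log(cacheControlFor("pricing")); // public, max-age=600
```

Defaulting unknown categories to "no-store" is a deliberately cautious choice: forgetting to classify data then costs latency instead of serving stale prices.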
Manual invalidation via Workers:
// Route /api/admin/invalidate-cache?key=products
if (url.pathname === "/api/admin/invalidate-cache") {
const key = url.searchParams.get("key");
// Validate the admin token (never leave this route open to everyone)
const token = request.headers.get("X-Admin-Token");
if (token !== env.ADMIN_TOKEN) {
return new Response("Unauthorized", { status: 401 });
}
// Delete from KV
await env.KV.delete(key);
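// Delete from Cache API too. This only clears the data center handling
// this request; other locations keep their copy until the TTL expires.
// The URL must match the cached one exactly, e.g. /api/products for key=products.
const cacheKey = new Request(`https://your-domain.com/api/${key}`, { method: "GET" });
await caches.default.delete(cacheKey);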
return new Response(JSON.stringify({ deleted: key }), {
headers: { "Content-Type": "application/json" }
});
}
Now you can invalidate manually when needed.
Implementation checklist
- Decide which cache to use (KV, R2, or Cache API)
- Create Cloudflare Workers project
- Configure KV or R2 binding in wrangler.toml
- Implement cache logic in fetch handler
- Add X-Cache headers (HIT/MISS) for debugging
- Test with multiple requests (verify hits)
- Monitor Cloudflare dashboard for 48h
- Adjust TTL based on traffic patterns
- Implement manual invalidation if needed
- Document cache strategy for your team
Final tip
Edge caching isn’t optional. It’s mandatory if you want the performance users now expect from a modern application.
Start with Cache API (free, simple). If you need more control, move to KV. If you’re serving images or large files, R2 is your friend.
Result? 50ms instead of 500ms. Everything you build after that feels the difference.
Want more on edge? Read Edge computing in practice, complete Cloudflare infrastructure, and Core Web Vitals 2026.