Performance

Cloudflare edge cache: 50ms or 500ms?

By Flávio Emanuel · 11 min read

I have a site that makes 50k requests/day to an internal API. Without edge caching, each request left the browser, went to my server in São Paulo, and waited while the server called an external API. Total: 800ms.

With Cloudflare edge cache: same request, 50ms.

That’s not just a “faster site”. It’s a site so fast it feels instant.

The secret is understanding which Cloudflare tool to use: KV (fast key-value), R2 (S3-like storage), or Cache API (native cache).

Each one fits a different case.

KV, R2, and Cache API: what each one is

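Cloudflare KV: a key-value store replicated across the edge. You store key-value pairs. “Fast” means ~50ms read latency globally (anywhere on Earth). Writes are eventually consistent: a change can take up to about a minute to become visible in every location.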

Cloudflare R2: object storage (like AWS S3). You store large files. Cheaper than S3. Less “immediate” than KV, but supports streaming.

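Cloudflare Cache API: programmatic access to Cloudflare's native HTTP cache from inside a Worker. Nothing is cached automatically: you store a response with cache.put() and read it back with cache.match(). Entries live in the data center that handled the request. Lighter than both previous options.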

When to use each one:

  • KV: small data that is read far more often than it is written (user session, API cache, dynamic config)
  • R2: large files (images, videos, PDFs), or when you need S3-like storage
  • Cache API: any HTTP response you want cached on the edge

Real example: API response caching

My site calls an external API that returns a product list. Without cache:

Browser -> Cloudflare -> My server in São Paulo -> External API -> Response (800ms)

With edge cache:

Browser -> Cloudflare cache -> Response (50ms)

How? Using Workers + Cache API:

// In your Cloudflare Worker entry file (e.g. src/index.js)
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);

    // Cache only GET requests to /api/products
    if (request.method === "GET" && url.pathname === "/api/products") {
      const cacheKey = new Request(url.toString(), { method: "GET" });
      const cache = caches.default;

      // Try fetching from cache
      const cached = await cache.match(cacheKey);
      if (cached) {
        // Found in cache! Copy the headers explicitly:
        // spreading a Headers object ({ ...headers }) copies nothing
        const headers = new Headers(cached.headers);
        headers.set("X-Cache", "HIT");
        return new Response(cached.body, { status: cached.status, headers });
      }

      // Not found. Call the API
      const upstream = await fetch("https://api.external.com/products");
      if (!upstream.ok) {
        // Don't cache upstream errors for an hour
        return upstream;
      }

      // Cache for 1 hour
      const headers = new Headers(upstream.headers);
      headers.set("Cache-Control", "public, max-age=3600");
      const response = new Response(upstream.body, {
        status: upstream.status,
        headers
      });

      // Store a copy in the cache without delaying the reply
      ctx.waitUntil(cache.put(cacheKey, response.clone()));

      // Label only the copy that goes to the client
      response.headers.set("X-Cache", "MISS");
      return response;
    }

    // For other requests, pass through
    return fetch(request);
  }
};

Result:

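  • First request: MISS, 800ms (called API)
  • Every request in the following hour: HIT, 50ms
  • After 1 hour: MISS again, 800ms

1 hour = 3600 seconds. At 50k requests/day (about 35 requests per minute, or roughly 2,080 per hour), a single cache location would serve about 2,080 requests per miss. Each Cloudflare data center keeps its own cache, so the real ratio is somewhat lower.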
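As a rough check of the hit ratio, here is a small sketch. It assumes evenly spread traffic and a single cache location, both simplifications, and the function names are mine, not part of any Cloudflare API.

```javascript
// Estimate how many requests land inside one TTL window.
// Assumes evenly spread traffic and a single cache location (a simplification).
function requestsPerWindow(requestsPerDay, ttlSeconds) {
  const secondsPerDay = 24 * 60 * 60;
  return (requestsPerDay * ttlSeconds) / secondsPerDay;
}

// One request per window is the miss that refills the cache; the rest are hits.
function hitsPerMiss(requestsPerDay, ttlSeconds) {
  return Math.max(requestsPerWindow(requestsPerDay, ttlSeconds) - 1, 0);
}

console.log(Math.round(requestsPerWindow(50_000, 3600))); // 2083
console.log(Math.round(hitsPerMiss(50_000, 3600))); // 2082
```

Traffic spread across several data centers divides the requests per window, which is why real numbers come out lower.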

Impact:

| Metric | Without cache | With cache | Improvement |
|---|---|---|---|
| p99 latency | 850ms | 75ms | 11x faster |
| Requests/day to external API | 50k | 33 | 1,500x fewer |
| Bandwidth cost | $150/month | $5/month | -97% |
| KV cost (reads) | $0 | $3/month | negligible |

The math is obvious.

KV for small, dynamic data

If the API response changes every hour, or you want to update it manually, KV is a better fit than the Cache API.

export default {
  async fetch(request, env) {
    const url = new URL(request.url);

    if (url.pathname === "/api/current-user") {
      const userId = url.searchParams.get("id");
      if (!userId) {
        // Without this check, every id-less request would share the key "user:null"
        return new Response("Missing id", { status: 400 });
      }
      const key = `user:${userId}`;

      // Try fetching from KV
      const cached = await env.KV.get(key);
      if (cached !== null) {
        return new Response(cached, {
          headers: { "X-Cache": "HIT", "Content-Type": "application/json" }
        });
      }

      // Call API
      const response = await fetch(
        `https://api.external.com/users/${encodeURIComponent(userId)}`
      );
      if (!response.ok) {
        // Don't store upstream errors in KV
        return new Response("Upstream error", { status: 502 });
      }
      const body = JSON.stringify(await response.json());

      // Store in KV for 5 minutes (KV's minimum TTL is 60 seconds)
      await env.KV.put(key, body, {
        expirationTtl: 300 // 5 minutes
      });

      return new Response(body, {
        headers: { "X-Cache": "MISS", "Content-Type": "application/json" }
      });
    }

    return fetch(request);
  }
};

KV is better here because:

  • You can update manually with env.KV.put()
  • Auto TTL (expires after 5 minutes)
  • Latency is similar (50ms)
  • You control invalidation
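To make the "update manually" point concrete, here is a minimal sketch of preparing a KV write when the source data changes. `buildUserEntry` is a hypothetical helper, and the user data is made up; the only real API involved is the binding's put(key, value, options) call shown in the comment.

```javascript
// Hypothetical helper: build the arguments for env.KV.put() for one user record.
function buildUserEntry(userId, data, ttlSeconds = 300) {
  return {
    key: `user:${userId}`,
    value: JSON.stringify(data),
    // KV rejects expirationTtl values below 60 seconds, so clamp it.
    options: { expirationTtl: Math.max(ttlSeconds, 60) },
  };
}

// Inside a Worker, after the source data changes:
//   const entry = buildUserEntry("42", { name: "Ana" });
//   await env.KV.put(entry.key, entry.value, entry.options);
const entry = buildUserEntry("42", { name: "Ana" }, 30);
console.log(entry.key, entry.options.expirationTtl); // user:42 60
```

Writing the fresh value directly replaces the cached one without waiting for the TTL, though other locations may keep serving the old value for a short while.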

R2 for optimized images

Say you have a site serving product images. You want:

  1. Upload original image (high quality, large)
  2. Optimize for different sizes (thumbnail, medium, large)
  3. Serve from edge (50ms)

R2 is ideal:

export default {
  async fetch(request, env) {
    const url = new URL(request.url);

    // URL like /image/product-123.jpg?w=800&q=80
    if (url.pathname.startsWith("/image/")) {
      const imageName = url.pathname.replace("/image/", "");
      const width = parseInt(url.searchParams.get("w"), 10) || 400;
      const quality = parseInt(url.searchParams.get("q"), 10) || 80;

      // Include size and quality in the key so variants don't overwrite each other
      const imageKey = `w${width}-q${quality}/${imageName}`;

      // Try fetching from R2 (get() returns null when the object doesn't exist)
      const object = await env.BUCKET.get(imageKey);
      if (object) {
        return new Response(object.body, {
          headers: {
            "Content-Type": object.httpMetadata?.contentType || "image/jpeg",
            "Cache-Control": "public, max-age=31536000" // 1 year
          }
        });
      }

      // Not in R2: generate on the fly.
      // Native tools like Sharp or ImageMagick don't run inside a Worker,
      // so this calls a separate image optimization service (simplified)
      const response = await fetch(
        `https://image-optimizer.your-domain.com/optimize?src=${encodeURIComponent(imageName)}&w=${width}&q=${quality}`
      );
      if (!response.ok) {
        // Don't store a failed optimization in R2
        return new Response("Image not available", { status: 502 });
      }

      const optimized = await response.arrayBuffer();

      // Store in R2 for next requests
      await env.BUCKET.put(imageKey, optimized, {
        httpMetadata: {
          contentType: "image/jpeg"
        }
      });

      return new Response(optimized, {
        headers: {
          "Content-Type": "image/jpeg",
          "Cache-Control": "public, max-age=31536000"
        }
      });
    }

    return fetch(request);
  }
};

Result: images served from edge without pre-generating all sizes.
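One detail worth isolating: if the size parameters are not part of the R2 key, every variant of an image ends up stored under the same object. A minimal sketch of building a per-variant key follows. The bounds (16 to 2000 pixels, quality 1 to 100) and the key layout are my assumptions, not R2 requirements.

```javascript
// Keep a number inside [min, max].
function clamp(n, min, max) {
  return Math.min(Math.max(n, min), max);
}

// Build a deterministic R2 key per image variant.
// Invalid or missing parameters fall back to defaults (400px, quality 80).
function variantKey(pathname, searchParams) {
  const name = pathname.replace(/^\/image\//, "");
  const width = clamp(parseInt(searchParams.get("w"), 10) || 400, 16, 2000);
  const quality = clamp(parseInt(searchParams.get("q"), 10) || 80, 1, 100);
  return { key: `w${width}-q${quality}/${name}`, width, quality };
}

const url = new URL("https://example.com/image/product-123.jpg?w=800&q=70");
console.log(variantKey(url.pathname, url.searchParams).key); // w800-q70/product-123.jpg
```

Clamping also stops someone from filling the bucket by requesting thousands of arbitrary sizes of the same image.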

Cache API for static HTML

If you use Astro (SSG), sometimes you want aggressive static HTML caching. Cache API is simpler:

// In your Worker entry file (e.g. src/index.js), assuming the Astro build output is bound as ASSETS
export default {
  async fetch(request, env, ctx) {
    const url = new URL(request.url);

    // Static HTML should be cached
    const isHtml = url.pathname.endsWith(".html") || url.pathname === "/";
    if (request.method === "GET" && isHtml) {
      const cache = caches.default;
      const cacheKey = new Request(url.toString(), { method: "GET" });

      const cached = await cache.match(cacheKey);
      if (cached) {
        return cached;
      }

      // Fetch HTML from the static assets
      const response = await env.ASSETS.fetch(request);

      // If successful, cache it
      if (response.status === 200) {
        const cacheHeaders = new Headers(response.headers);
        cacheHeaders.set("Cache-Control", "public, max-age=3600");
        const toCache = new Response(response.body, {
          status: response.status,
          headers: cacheHeaders
        });

        // Store a copy without delaying the reply
        ctx.waitUntil(cache.put(cacheKey, toCache.clone()));
        return toCache;
      }

      return response;
    }

    return env.ASSETS.fetch(request);
  }
};

Simple: Cache API + 1-hour TTL = fresh HTML without overhead.

Real measurements

Setup: an e-commerce site.

Before (no edge cache):

  • p50: 350ms
  • p95: 620ms
  • p99: 850ms
  • Requests/second: 8

After (KV + R2 + Cache API):

  • p50: 45ms
  • p95: 85ms
  • p99: 120ms
  • Requests/second: 200 (25x more throughput)

The difference isn’t just latency. The number of requests the setup handles increased dramatically. Before, you needed 8 servers for horizontal scaling. Now one is enough.
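If you want to produce the same p50/p95/p99 figures from your own logs, a minimal sketch using the nearest-rank method is below. The sample latencies are made up for illustration, and other percentile definitions (with interpolation) give slightly different values.

```javascript
// Nearest-rank percentile: the smallest sample such that at least
// p percent of all samples are less than or equal to it.
function percentile(samples, p) {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p * sorted.length) / 100);
  return sorted[Math.max(rank, 1) - 1];
}

// Made-up latencies in milliseconds
const latencies = [120, 45, 85, 50, 40];
console.log(percentile(latencies, 50)); // 50
console.log(percentile(latencies, 99)); // 120
```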

Real costs

Cloudflare Workers:

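| What | Free tier | Paid | Price |
|---|---|---|---|
| Requests | 100k/day free | Unlimited | $0.50 per 1M |
| KV Reads | 100k/day free | Unlimited | $0.50 per 1M |
| KV Writes | 1k/day free | Unlimited | $5 per 1M |
| KV Storage | 1GB free | Unlimited | $0.50 per GB/month |
| R2 Storage | 10GB free | Unlimited | $0.015 per GB/month (egress is free) |
| Cache API | Free | Free | Zero additional cost |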

For 50k requests/day (1.5M/month):

  • Cache hits (KV reads): 1M = $0.50
  • Cache misses (KV writes): 2k = $0.01
  • Total: $0.51/month
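The same estimate as a small sketch, using the per-million prices from the table. It deliberately ignores free-tier allowances and any plan minimums, so treat it as an upper-bound approximation; `monthlyKvCost` is a hypothetical name.

```javascript
// USD per 1M operations, taken from the pricing table in this post.
const PRICE_PER_MILLION = { kvReads: 0.5, kvWrites: 5 };

// Estimate monthly KV cost in USD, rounded to cents.
// Ignores free-tier allowances (an intentional simplification).
function monthlyKvCost(reads, writes) {
  const cost =
    (reads / 1e6) * PRICE_PER_MILLION.kvReads +
    (writes / 1e6) * PRICE_PER_MILLION.kvWrites;
  return Math.round(cost * 100) / 100;
}

console.log(monthlyKvCost(1_000_000, 2_000)); // 0.51
```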

Your origin server saves much more than that in bandwidth.

TTL and invalidation

Common question: how long to cache?

  • Data that changes rarely (category list, product info): 1-24 hours
  • Data that changes frequently (price, stock): 5-30 minutes
  • Real-time data (user cart): don’t cache, or use session
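One way to keep these choices in a single place is a small lookup that turns a data category into a Cache-Control value. This is a sketch: the category names and the exact TTLs are assumptions picked from inside the ranges above.

```javascript
// TTL per data category, in seconds (values picked from the ranges above).
const TTL_SECONDS = new Map([
  ["catalog", 6 * 60 * 60], // changes rarely: within the 1-24 hour range
  ["pricing", 10 * 60],     // changes frequently: within the 5-30 minute range
  ["cart", 0],              // real-time: do not cache
]);

// Unknown categories are treated as uncacheable, which is the safe default.
function cacheControlFor(kind) {
  const ttl = TTL_SECONDS.get(kind) ?? 0;
  return ttl > 0 ? `public, max-age=${ttl}` : "no-store";
}

console.log(cacheControlFor("catalog")); // public, max-age=21600
console.log(cacheControlFor("cart"));    // no-store
```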

Manual invalidation via Workers:

// Route /api/admin/invalidate-cache?key=products
if (url.pathname === "/api/admin/invalidate-cache") {
  // Validate the token first (never leave this route open to everyone)
  const token = request.headers.get("X-Admin-Token");
  if (!env.ADMIN_TOKEN || token !== env.ADMIN_TOKEN) {
    return new Response("Unauthorized", { status: 401 });
  }

  const key = url.searchParams.get("key");
  if (!key) {
    return new Response("Missing key", { status: 400 });
  }

  // Delete from KV
  await env.KV.delete(key);

  // Delete from Cache API too. The cache key is the full URL, so `key` must
  // match the cached path. This only purges the data center handling this request.
  const cacheKey = new Request(`https://your-domain.com/${key}`, { method: "GET" });
  await caches.default.delete(cacheKey);

  return new Response(JSON.stringify({ deleted: key }), {
    headers: { "Content-Type": "application/json" }
  });
}

Now you can invalidate manually when needed.
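Calling that route from a deploy script could look like the sketch below. `buildInvalidateRequest` is a hypothetical helper, the domain is a placeholder, and the token is assumed to come from your own secret storage.

```javascript
// Build the URL and fetch options for the invalidation route.
function buildInvalidateRequest(baseUrl, key, adminToken) {
  const url = new URL("/api/admin/invalidate-cache", baseUrl);
  url.searchParams.set("key", key); // handles URL encoding of the key
  return {
    url: url.toString(),
    init: { method: "GET", headers: { "X-Admin-Token": adminToken } },
  };
}

// Usage (placeholder domain and token):
//   const { url, init } = buildInvalidateRequest("https://your-domain.com", "products", token);
//   const res = await fetch(url, init);
const req = buildInvalidateRequest("https://your-domain.com", "products", "secret");
console.log(req.url); // https://your-domain.com/api/admin/invalidate-cache?key=products
```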

Implementation checklist

  • Decide which cache to use (KV, R2, or Cache API)
  • Create Cloudflare Workers project
  • Configure KV or R2 binding in wrangler.toml
  • Implement cache logic in fetch handler
  • Add X-Cache headers (HIT/MISS) for debugging
  • Test with multiple requests (verify hits)
  • Monitor Cloudflare dashboard for 48h
  • Adjust TTL based on traffic patterns
  • Implement manual invalidation if needed
  • Document cache strategy for your team
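For the "verify hits" step, a short script that requests the same URL a few times and counts the X-Cache values is usually enough. This is a sketch: the URL is a placeholder, and the `fetchFn` parameter exists only so the function can be tried with a stub instead of the network.

```javascript
// Request the same URL several times and count the X-Cache values seen.
// `fetchFn` defaults to the global fetch; pass a stub to try it offline.
async function tallyCacheStatus(url, attempts, fetchFn = fetch) {
  const counts = {};
  for (let i = 0; i < attempts; i++) {
    const res = await fetchFn(url);
    const status = res.headers.get("X-Cache") ?? "NONE";
    counts[status] = (counts[status] ?? 0) + 1;
  }
  return counts;
}

// Example (placeholder URL):
//   tallyCacheStatus("https://your-domain.com/api/products", 5).then(console.log);
```

A healthy result is one MISS followed by HITs. Repeated MISS values usually mean the response is not being stored, or the requests are landing in different data centers.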

Final tip

Edge caching isn’t optional. It’s mandatory if you want modern application performance.

Start with Cache API (free, simple). If you need more control, move to KV. If you’re serving images or large files, R2 is your friend.

Result? 50ms instead of 500ms. Everything you build after that feels the difference.

Want more on edge? Read Edge computing in practice, complete Cloudflare infrastructure, and Core Web Vitals 2026.
