Why Caching Matters
AI and API calls are expensive. Serving a repeated request from cache can save you 90%+ on costs and reduce latency from seconds to milliseconds. Conductor provides multi-layer caching out of the box.
Cost Savings
Cache identical AI requests - pay once, use many times
Speed
Cached responses return in < 5ms vs 1-2 seconds for API calls
Reliability
Continue working even if AI providers have outages
Rate Limits
Reduce API calls to stay within provider rate limits
How Caching Works
Three Caching Layers
Conductor provides three complementary caching layers:
1. Member-Level Cache (KV)
Conductor’s built-in cache using Cloudflare KV:
- Automatic cache key generation (member name + input hash)
- TTL control per member
- Cache bypass option
- Works across all member types
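A member-level cache configuration might look roughly like the following sketch; the `cache`, `ttl`, and `bypass` option names are illustrative assumptions, not Conductor's documented API:

```typescript
// Illustrative member definition with cache settings.
// Option names here are assumptions, not Conductor's actual schema.
interface CacheConfig {
  enabled: boolean;
  ttl: number;      // seconds before the cached entry expires
  bypass?: boolean; // skip the cache for this request
}

interface MemberConfig {
  name: string;
  cache?: CacheConfig;
}

const summarizer: MemberConfig = {
  name: "summarize-article",
  cache: {
    enabled: true,
    ttl: 3600, // cache summaries for one hour
  },
};

console.log(summarizer.cache?.ttl); // 3600
```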
2. AI Gateway Cache
Cloudflare AI Gateway provides persistent caching for AI provider calls:
- Persistent cache across deployments
- Real-time analytics and monitoring
- Cache hit rate tracking
- Works with OpenAI, Anthropic, Groq, etc.
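In practice, routing a provider call through AI Gateway is mostly a base-URL change: point the request at your gateway's endpoint instead of the provider's. A sketch for an OpenAI-style call (the account and gateway IDs are placeholders):

```typescript
// Point the OpenAI-compatible request at an AI Gateway endpoint
// instead of api.openai.com. ACCOUNT_ID and GATEWAY_ID are placeholders.
const ACCOUNT_ID = "your-account-id";
const GATEWAY_ID = "your-gateway";

const baseUrl = `https://gateway.ai.cloudflare.com/v1/${ACCOUNT_ID}/${GATEWAY_ID}/openai`;

async function chat(prompt: string, apiKey: string): Promise<Response> {
  return fetch(`${baseUrl}/chat/completions`, {
    method: "POST",
    headers: {
      Authorization: `Bearer ${apiKey}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      model: "gpt-4o-mini", // example model id
      messages: [{ role: "user", content: prompt }],
    }),
  });
}
```

Responses served from the gateway cache never reach the provider, so they cost nothing and return quickly.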
3. Browser Cache (Client-Side)
For web applications, use HTTP cache headers such as Cache-Control and ETag.
Member-Level Caching
Basic Configuration
Enable caching for any member in its configuration.
Cache Bypass
Skip the cache for specific requests when you need a fresh response.
Cache Key Customization
Conductor automatically generates cache keys from:
- Member name
- Input data (hashed)
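That scheme can be sketched as follows; this is a simplified illustration, not Conductor's actual implementation:

```typescript
// Build a deterministic cache key from member name + a hash of the input,
// mirroring the "member name + input hash" scheme described above.
// (Simple FNV-1a hash for illustration; a real implementation would
// likely use a cryptographic hash such as SHA-256.)
function hashInput(input: unknown): string {
  const text = JSON.stringify(input);
  let hash = 0x811c9dc5; // FNV-1a 32-bit offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime
  }
  return (hash >>> 0).toString(16);
}

function cacheKey(memberName: string, input: unknown): string {
  return `${memberName}:${hashInput(input)}`;
}

console.log(cacheKey("summarize", { url: "https://example.com" }));
```

Identical inputs always hash to the same key, so repeated requests hit the same cache entry.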
Per-Member TTL
Different members can have different cache durations, matched to how quickly their data goes stale.
AI Gateway Caching
Setup
Configure AI Gateway in wrangler.toml.
Usage
Route AI calls through the gateway instead of calling the provider directly.
Benefits
Persistent Cache
Cache survives deployments and spans all Workers. Once cached, every user benefits.
Analytics
Track cache hit rates, costs, and latency in a real-time dashboard.
Cost Controls
Set spending limits, rate limits, and alerts via the dashboard.
A/B Testing
Compare different models, prompts, and parameters with traffic splitting.
Cache Hit Rate Monitoring
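AI Gateway's dashboard reports hit rates for gateway-routed calls; for member-level caching you can track them yourself with something as simple as this (an illustrative sketch, not a Conductor API):

```typescript
// Simple hit/miss counter for observing cache effectiveness.
class CacheStats {
  private hits = 0;
  private misses = 0;

  record(hit: boolean): void {
    if (hit) this.hits++;
    else this.misses++;
  }

  hitRate(): number {
    const total = this.hits + this.misses;
    return total === 0 ? 0 : this.hits / total;
  }
}

const stats = new CacheStats();
stats.record(true);
stats.record(true);
stats.record(false);
console.log(stats.hitRate().toFixed(2)); // "0.67"
```

Call `record()` wherever you look up the cache, and log or export the rate periodically.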
Caching Strategies
Time-Based Invalidation
Set TTL based on data freshness requirements: short for volatile data, long for stable data.
Conditional Caching
Cache only when appropriate, for example skipping errors and empty results.
Layered Caching
Combine multiple cache layers so a miss in one layer can be served by the next.
Cache Warming
Pre-populate the cache for common requests so even the first user gets a fast response.
Cache Invalidation
Time-Based (Automatic)
The TTL expires and the cache entry is removed automatically.
Manual Invalidation
Programmatically clear cache entries when you know they are stale.
Event-Based Invalidation
Invalidate cached entries as soon as the underlying data changes.
Webhook-Triggered Invalidation
Clear the cache when external data changes, for example from a webhook handler.
Cache Configuration Examples
API Member with Cache
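Since the concrete configuration was stripped here, this is a hypothetical sketch of an API member with a five-minute cache (field names are illustrative assumptions, not Conductor's documented schema):

```typescript
// Hypothetical API member: cache upstream responses for 5 minutes.
// Field names are illustrative assumptions, not Conductor's real schema.
const weatherApi = {
  type: "api",
  name: "weather-lookup",
  endpoint: "https://api.example.com/weather",
  cache: { enabled: true, ttl: 300 }, // seconds
};

console.log(weatherApi.cache.ttl); // 300
```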
Think Member with AI Gateway
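A hypothetical sketch of a think member routed through AI Gateway, again with assumed field names:

```typescript
// Hypothetical think (AI) member routed through AI Gateway, with a
// long TTL because identical prompts produce reusable answers.
// Field names are illustrative assumptions.
const articleSummarizer = {
  type: "think",
  name: "summarize-article",
  model: "gpt-4o-mini",     // example model id
  gateway: "my-ai-gateway", // AI Gateway name (placeholder)
  cache: { enabled: true, ttl: 86400 }, // 24 hours
};

console.log(articleSummarizer.cache.ttl); // 86400
```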
Data Member with Short Cache
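A hypothetical sketch of a data member whose cache is kept deliberately short (assumed field names):

```typescript
// Hypothetical data member: short TTL because the data changes often.
// Field names are illustrative assumptions.
const stockPrices = {
  type: "data",
  name: "stock-prices",
  cache: { enabled: true, ttl: 30 }, // 30 seconds
};

console.log(stockPrices.cache.ttl); // 30
```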
Function Member with Conditional Cache
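A hypothetical sketch of a function member that only caches successful results; the `shouldCache` predicate is an illustrative assumption:

```typescript
// Hypothetical function member with a conditional cache: only cache
// successful results. Field names are illustrative assumptions.
const geocoder = {
  type: "function",
  name: "geocode",
  cache: {
    enabled: true,
    ttl: 3600,
    shouldCache: (result: { ok: boolean }) => result.ok, // skip failures
  },
};

console.log(geocoder.cache.shouldCache({ ok: false })); // false
```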
Best Practices
1. Start with Longer TTLs
2. Cache Expensive Operations
3. Use AI Gateway for AI Calls
4. Monitor Cache Hit Rates
5. Consider Cache Size Limits
KV has size limits (25 MB per value), so avoid caching very large responses.
6. Use Namespace Prefixes
Organize cache keys by environment, e.g. with prod: and staging: prefixes.
Cost Analysis
Without Caching
1,000 identical requests at roughly $0.002 and 2 seconds per call: about $2.00 and 2,000 seconds in total.
With Caching
One real API call plus 999 cache hits at ~5 ms each: about $0.002 and roughly 7 seconds in total.
- 💰 Cost: $1.998 saved (99.9%)
- ⚡ Time: 1,993 seconds saved (99.65%)
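These figures can be checked with a quick calculation; the per-call cost and timing inputs below are illustrative assumptions:

```typescript
// Checking the savings figures: assume 1,000 identical requests,
// $0.002 and 2 s per uncached call, and ~5 ms per cache hit.
const requests = 1000;
const costPerCall = 0.002; // USD
const secsPerCall = 2;
const secsPerHit = 0.005;

const uncachedCost = requests * costPerCall;                  // ~$2.00
const uncachedSecs = requests * secsPerCall;                  // 2,000 s
const cachedCost = costPerCall;                               // one real call
const cachedSecs = secsPerCall + (requests - 1) * secsPerHit; // ~7 s

console.log((uncachedCost - cachedCost).toFixed(3)); // "1.998"
console.log(Math.round(uncachedSecs - cachedSecs));  // 1993
```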

