Overview
Configure AI providers (OpenAI, Anthropic, Groq, Workers AI) for your Conductor workflows. Learn about API keys, routing modes, model selection, and AI Gateway integration.

Supported Providers
Workers AI (Cloudflare)
Edge-native AI models with sub-50ms cold starts. Configuration:
- @cf/meta/llama-3.1-8b-instruct - Fast general purpose
- @cf/mistral/mistral-7b-instruct-v0.1 - Efficient reasoning
- @cf/openai/whisper - Speech recognition
- @cf/stabilityai/stable-diffusion-xl-base-1.0 - Image generation
Benefits:
- No API keys required
- Sub-50ms cold starts
- Included in Workers plan
- No external API calls
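Inside a Worker, these models are invoked through the `env.AI.run(model, input)` binding; they are also reachable over Cloudflare's REST API. A minimal sketch of the REST endpoint shape (the account ID is a placeholder, and how Conductor wires this internally is not shown here):

```typescript
// Build the Workers AI REST endpoint for a given account and model.
// Inside a Worker you would use the `env.AI.run(model, input)` binding instead.
function workersAiUrl(accountId: string, model: string): string {
  return `https://api.cloudflare.com/client/v4/accounts/${accountId}/ai/run/${model}`;
}

// Example: endpoint for the fast general-purpose model listed above.
const url = workersAiUrl("YOUR_ACCOUNT_ID", "@cf/meta/llama-3.1-8b-instruct");
```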
OpenAI
GPT-4o, GPT-4o-mini, and other OpenAI models. Add API Key:
- gpt-4o - Most capable, $15 per 1M tokens
- gpt-4o-mini - Fast and cheap, $0.60 per 1M tokens
- gpt-4-turbo - Previous flagship
- gpt-3.5-turbo - Legacy, use gpt-4o-mini instead
Anthropic (Claude)
Claude 3.5 Sonnet and other Anthropic models. Add API Key:
- claude-3-5-sonnet-20241022 - Most capable, $15 per 1M tokens
- claude-3-opus-20240229 - Previous flagship
- claude-3-haiku-20240307 - Fast and cheap
Groq
Ultra-fast inference with LLaMA and other models. Add API Key:
- llama-3.1-70b-versatile - Fast, capable
- llama-3.1-8b-instant - Fastest
- mixtral-8x7b-32768 - Long context
Benefits:
- Extremely fast inference (< 100ms)
- Generous free tier
- Long context windows
AI Gateway
Cloudflare AI Gateway provides caching, analytics, and rate limiting for AI requests.

Setup
1. Create a Gateway in the Cloudflare Dashboard:
   - Go to AI → AI Gateway
   - Click “Create Gateway”
   - Copy the Gateway ID
2. Configure routing in Conductor using the Gateway ID.
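AI Gateway exposes a provider-scoped base URL; pointing an existing SDK's `baseURL` at it routes requests through the gateway. A sketch of that URL shape (the account and gateway IDs are placeholders):

```typescript
// Providers supported by the gateway URL scheme (subset shown).
type Provider = "openai" | "anthropic" | "groq" | "workers-ai";

// Build the gateway base URL for a given account, gateway, and provider.
function gatewayBaseUrl(accountId: string, gatewayId: string, provider: Provider): string {
  return `https://gateway.ai.cloudflare.com/v1/${accountId}/${gatewayId}/${provider}`;
}

// Example: use this as the baseURL for an OpenAI-compatible client.
const base = gatewayBaseUrl("YOUR_ACCOUNT_ID", "my-gateway", "openai");
```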
Routing Modes
1. Direct (No Gateway)
- Requests go directly to the provider (OpenAI, Anthropic, etc.)
- No caching
- No analytics
- Lowest latency
2. Cloudflare Gateway
- Requests routed through AI Gateway
- Persistent caching (identical requests cached)
- Analytics dashboard
- Slightly higher latency (~50ms)
3. Cloudflare (Workers AI)
- Uses Cloudflare Workers AI
- No external API calls
- Sub-50ms cold starts
- No API keys needed
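The three modes above boil down to which base URL a request targets. A hypothetical resolver, illustrative only (the names and config shape are assumptions, not Conductor's API):

```typescript
// The three routing modes described above.
type RoutingMode = "direct" | "gateway" | "workers-ai";

interface RoutingConfig {
  mode: RoutingMode;
  accountId?: string; // required for "gateway" and "workers-ai"
  gatewayId?: string; // required for "gateway"
}

// Resolve which base URL a request should use for a given mode.
function resolveBaseUrl(cfg: RoutingConfig, directUrl: string): string {
  switch (cfg.mode) {
    case "direct":
      return directUrl; // straight to OpenAI/Anthropic/etc.: lowest latency
    case "gateway":
      return `https://gateway.ai.cloudflare.com/v1/${cfg.accountId}/${cfg.gatewayId}/openai`;
    case "workers-ai":
      return `https://api.cloudflare.com/client/v4/accounts/${cfg.accountId}/ai/run`;
  }
}
```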
Cache Configuration
Automatic Caching:
- Default TTL: 30 days
- Configurable in the AI Gateway dashboard

Only identical requests share a cache entry; the cache key covers:
- Model name
- Prompt text
- Temperature
- Max tokens
- Other parameters
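Because the cache key covers every parameter, two calls only hit the same entry when all of them match. A sketch of that idea as a deterministic key (illustrative, not the gateway's actual key function):

```typescript
// The request parameters that determine cache identity.
interface CacheKeyParts {
  model: string;
  prompt: string;
  temperature: number;
  maxTokens: number;
}

// Derive a deterministic key: JSON over a fixed field order.
function cacheKey(p: CacheKeyParts): string {
  return JSON.stringify([p.model, p.prompt, p.temperature, p.maxTokens]);
}
```

Identical requests share a key; changing any parameter (e.g. temperature) produces a different key and therefore a cache miss.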
Model Selection
By Task Complexity
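One way to map task complexity onto the models listed above, with a free edge model for simple work and the most capable models reserved for hard tasks. The tiers and function name are assumptions, not Conductor's API:

```typescript
type Complexity = "simple" | "moderate" | "complex";

// Pick a model tier by task complexity (illustrative mapping).
function modelForComplexity(c: Complexity): string {
  switch (c) {
    case "simple":
      return "@cf/meta/llama-3.1-8b-instruct"; // free, edge-native
    case "moderate":
      return "gpt-4o-mini";                    // fast and cheap
    case "complex":
      return "claude-3-5-sonnet-20241022";     // most capable
  }
}
```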
By Latency Requirements
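A latency-driven choice might prefer Groq (sub-100ms inference) for the tightest budgets and Workers AI at the edge for moderate ones. The thresholds below are illustrative assumptions:

```typescript
// Pick a model by latency budget in milliseconds (thresholds are assumptions).
function modelForLatencyBudget(budgetMs: number): string {
  if (budgetMs < 200) return "llama-3.1-8b-instant";            // Groq: fastest inference
  if (budgetMs < 1000) return "@cf/meta/llama-3.1-8b-instruct"; // edge-native, sub-50ms cold start
  return "gpt-4o";                                              // capability over speed
}
```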
By Cost
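Costs follow directly from the per-1M-token prices listed above (gpt-4o at $15 per 1M, gpt-4o-mini at $0.60 per 1M). A small helper for that arithmetic:

```typescript
// Estimate cost in USD from token count and a per-1M-token price.
function estimateCostUsd(tokens: number, pricePerMillionUsd: number): number {
  return (tokens / 1_000_000) * pricePerMillionUsd;
}

// 500k tokens on gpt-4o-mini costs about $0.30; the same volume on gpt-4o about $7.50.
```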
Environment-Specific Configuration
Multiple Environments
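A common pattern is cheap or free models in development and capable models in production, with a separate API-key variable per environment. The config shape and variable names below are hypothetical:

```typescript
type Env = "development" | "staging" | "production";

// Per-environment defaults (illustrative; not Conductor's actual config schema).
function configForEnv(env: Env): { model: string; apiKeyVar: string } {
  switch (env) {
    case "development":
      return { model: "@cf/meta/llama-3.1-8b-instruct", apiKeyVar: "DEV_OPENAI_API_KEY" };
    case "staging":
      return { model: "gpt-4o-mini", apiKeyVar: "STAGING_OPENAI_API_KEY" };
    case "production":
      return { model: "gpt-4o", apiKeyVar: "PROD_OPENAI_API_KEY" };
  }
}
```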
Dynamic Model Selection
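Model choice can also happen at runtime: start on a cheap model and escalate for long or explicitly flagged inputs. The thresholds and signature here are assumptions, sketched for illustration:

```typescript
// Pick a model at runtime based on input size and a reasoning flag.
function pickModel(prompt: string, needsReasoning = false): string {
  if (needsReasoning) return "claude-3-5-sonnet-20241022"; // hardest tasks
  if (prompt.length > 4000) return "gpt-4o";               // long input, likely harder
  return "gpt-4o-mini";                                    // default: fast and cheap
}
```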
Cost Optimization
1. Cache Aggressively
2. Use Cheaper Models
3. Reduce Token Usage
4. Batch Requests
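Batching means grouping prompts into fixed-size chunks so one API call carries several items instead of one call per prompt. A generic chunking helper (illustrative):

```typescript
// Split an array into fixed-size batches.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}
```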
Advanced Configuration
Custom Headers
Timeouts
Retry Logic
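Retry logic pairs a bounded attempt loop with an exponential backoff schedule (`base * 2^attempt`). The synchronous wrapper below keeps the sketch simple; a real implementation would `await` the backoff delay around an async API call:

```typescript
// Exponential backoff schedule: 250ms, 500ms, 1000ms, ...
function backoffMs(attempt: number, baseMs = 250): number {
  return baseMs * 2 ** attempt;
}

// Retry a function up to `attempts` times, rethrowing the last error.
// (Sketch: async code would sleep backoffMs(i) between attempts.)
function retrySync<T>(fn: () => T, attempts = 3): T {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return fn();
    } catch (e) {
      lastError = e; // transient failure: back off, then retry
    }
  }
  throw lastError;
}
```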
Monitoring
Track Token Usage
Log AI Calls
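A minimal in-process tracker can accumulate token counts per model, so logged totals can be reconciled against the AI Gateway dashboard. The class shape is illustrative, not a Conductor API:

```typescript
// Accumulate prompt + completion tokens per model.
class TokenTracker {
  private totals = new Map<string, number>();

  record(model: string, promptTokens: number, completionTokens: number): void {
    const prev = this.totals.get(model) ?? 0;
    this.totals.set(model, prev + promptTokens + completionTokens);
  }

  total(model: string): number {
    return this.totals.get(model) ?? 0;
  }
}
```

Call `record(...)` after each AI response (most provider responses include a usage field), then log or export the totals periodically.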
AI Gateway Analytics
View in the Cloudflare Dashboard:
- Request count
- Cache hit rate
- Token usage
- Cost tracking
- Error rates
Best Practices
- Use AI Gateway - Caching and analytics
- Start with cheap models - Escalate if needed
- Lower temperature for caching - Better cache hits
- Batch when possible - Fewer API calls
- Monitor costs - Track token usage
- Rotate API keys - Security best practice
- Use environment-specific keys - Separate dev/prod
- Implement rate limiting - Prevent abuse
- Add retry logic - Handle transient failures
- Validate API keys on start - Fail fast

