Provider Routing

Overview

Provider routing selects an AI provider at run time based on task requirements, availability, cost constraints, or performance metrics. The strategies below cover static and rule-based routing, fallbacks, and load balancing, and they can be combined as your workflow requires.

Routing Strategies

1. Static Routing

Hardcode provider selection:

- member: generate-content
  type: Think
  config:
    provider: anthropic
    model: claude-3-5-sonnet-20241022
  input:
    prompt: ${input.prompt}

Use when:

Single provider meets all needs
Simplicity is the priority
No fallback required

2. Environment-Based Routing

Select provider via environment variables:

- member: generate-content
  type: Think
  config:
    provider: ${env.AI_PROVIDER}
    model: ${env.AI_MODEL}
  input:
    prompt: ${input.prompt}

Set the variables for each environment:

# Development
AI_PROVIDER=cloudflare
AI_MODEL=@cf/meta/llama-3.1-8b-instruct

# Production
AI_PROVIDER=anthropic
AI_MODEL=claude-3-5-sonnet-20241022

Use when:

Different providers per environment
Easy configuration changes
No code changes needed
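
Environment variables can be missing or misspelled, so it may help to validate them before they reach the Think member. The sketch below is a hypothetical helper, not part of Conductor: the name resolveProvider, the allow-list, and the default values are assumptions chosen for illustration.

```javascript
// Hypothetical helper: resolve provider and model from an env-like object.
// The allow-list and defaults are illustrative assumptions.
const KNOWN_PROVIDERS = ['anthropic', 'openai', 'cloudflare'];
const DEFAULTS = { provider: 'cloudflare', model: '@cf/meta/llama-3.1-8b-instruct' };

function resolveProvider(env, defaults = DEFAULTS) {
  const provider = env.AI_PROVIDER || defaults.provider;
  if (!KNOWN_PROVIDERS.includes(provider)) {
    throw new Error(`Unknown AI_PROVIDER: ${provider}`);
  }
  // The default model only makes sense for the default provider;
  // any other provider needs an explicit AI_MODEL.
  const model = env.AI_MODEL || (provider === defaults.provider ? defaults.model : undefined);
  if (!model) {
    throw new Error(`AI_MODEL must be set for provider ${provider}`);
  }
  return { provider, model };
}
```

Failing fast on an unknown provider surfaces a configuration mistake at startup rather than on the first request.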

3. Task-Based Routing

Route based on task characteristics:

- member: route-by-task
  type: Function
  config:
    handler: |-
      (input) => {
        const { taskType, complexity } = input;
        
        if (taskType === 'classification') {
          return {
            provider: 'cloudflare',
            model: '@cf/meta/llama-3.1-8b-instruct'
          };
        } else if (complexity === 'high') {
          return {
            provider: 'anthropic',
            model: 'claude-3-opus-20240229'
          };
        } else {
          return {
            provider: 'anthropic',
            model: 'claude-3-5-sonnet-20241022'
          };
        }
      }

- member: execute-with-route
  type: Think
  config:
    provider: ${route-by-task.output.provider}
    model: ${route-by-task.output.model}
  input:
    prompt: ${input.prompt}

Use when:

Multiple task types
Cost optimization needed
Performance varies by task
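
As the number of task types grows, the if/else chain can be expressed as an ordered rule table. This is a sketch of the same routing decisions as the handler above; the names rules, defaultRoute, and routeByTask are illustrative and not part of any API.

```javascript
// Table-driven variant of the route-by-task handler.
// Rules are checked in order; the first match wins.
const rules = [
  {
    when: (task) => task.taskType === 'classification',
    route: { provider: 'cloudflare', model: '@cf/meta/llama-3.1-8b-instruct' },
  },
  {
    when: (task) => task.complexity === 'high',
    route: { provider: 'anthropic', model: 'claude-3-opus-20240229' },
  },
];

// Used when no rule matches.
const defaultRoute = { provider: 'anthropic', model: 'claude-3-5-sonnet-20241022' };

function routeByTask(task) {
  const match = rules.find((rule) => rule.when(task));
  return match ? match.route : defaultRoute;
}
```

Adding a task type then means appending one entry to the table instead of editing nested branches.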

4. Fallback Routing

Try multiple providers in sequence:

flow:
  # Try primary provider
  - member: try-anthropic
    type: Think
    config:
      provider: anthropic
      model: claude-3-5-sonnet-20241022
    input:
      prompt: ${input.prompt}
  
  # Fallback to OpenAI if Anthropic fails
  - member: fallback-openai
    condition: ${try-anthropic.error}
    type: Think
    config:
      provider: openai
      model: gpt-4-turbo-preview
    input:
      prompt: ${input.prompt}
  
  # Final fallback to Cloudflare
  - member: fallback-cloudflare
    condition: ${fallback-openai.error}
    type: Think
    config:
      provider: cloudflare
      model: '@cf/meta/llama-3.1-8b-instruct'
    input:
      prompt: ${input.prompt}
  
  # Combine results
  - member: get-result
    type: Function
    config:
      handler: |-
        (input) => ({
          result: input.tryAnthropic?.output || 
                  input.fallbackOpenai?.output || 
                  input.fallbackCloudflare?.output,
          provider: input.tryAnthropic?.output ? 'anthropic' :
                   input.fallbackOpenai?.output ? 'openai' : 'cloudflare'
        })

Use when:

High availability required
Provider outages possible
Redundancy needed
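
The same fallback chain can be sketched as a single function when the provider list is dynamic. Here callProvider is a caller-supplied async function and withFallback is a hypothetical name; neither is a Conductor API.

```javascript
// Try each provider in order and return the first successful output,
// together with the errors collected from earlier attempts.
// `callProvider` is supplied by the caller (an assumption of this sketch).
async function withFallback(providers, callProvider) {
  const errors = [];
  for (const p of providers) {
    try {
      const output = await callProvider(p);
      return { output, provider: p.provider, errors };
    } catch (err) {
      errors.push({ provider: p.provider, message: String((err && err.message) || err) });
    }
  }
  // Every provider failed: surface all collected errors to the caller.
  const failure = new Error('All providers failed');
  failure.errors = errors;
  throw failure;
}
```

Returning the collected errors alongside the result makes it possible to log which providers were skipped even when the request ultimately succeeds.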

5. Cost-Based Routing

Route based on budget constraints:

- member: check-budget
  type: Data
  config:
    type: kv
    operation: get
    key: 'monthly_ai_spend'

- member: route-by-cost
  type: Function
  config:
    handler: |-
      (input) => {
        const spent = input.checkBudget.value || 0;
        const budget = 1000; // $1000/month
        
        if (spent < budget * 0.8) {
          // Under 80% - use best model
          return {
            provider: 'anthropic',
            model: 'claude-3-5-sonnet-20241022'
          };
        } else if (spent < budget * 0.95) {
          // 80-95% - use cheaper model
          return {
            provider: 'anthropic',
            model: 'claude-3-haiku-20240307'
          };
        } else {
          // Near or over budget - use the lowest-cost model
          return {
            provider: 'cloudflare',
            model: '@cf/meta/llama-3.1-8b-instruct'
          };
        }
      }

- member: generate-with-budget
  type: Think
  config:
    provider: ${route-by-cost.output.provider}
    model: ${route-by-cost.output.model}
  input:
    prompt: ${input.prompt}

Use when:

Budget constraints exist
Cost control is a priority
Usage varies monthly
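
The router above reads monthly_ai_spend, so something has to keep that value current. A minimal sketch of the write side follows, assuming token counts are available after each request. The model names and prices are placeholders, not real rates, and estimateCost and addSpend are hypothetical helpers.

```javascript
// Placeholder prices in USD per million tokens. These are NOT real quotes;
// replace them with your providers' current rates.
const PRICES = {
  'model-large': { input: 3.0, output: 15.0 },
  'model-small': { input: 0.25, output: 1.25 },
};

// Estimate the cost of one request from its token counts.
function estimateCost(model, inputTokens, outputTokens) {
  const price = PRICES[model];
  if (!price) throw new Error(`No price configured for ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1e6;
}

// Return the new running total to store back under 'monthly_ai_spend'.
function addSpend(currentSpend, model, inputTokens, outputTokens) {
  return (currentSpend || 0) + estimateCost(model, inputTokens, outputTokens);
}
```

The returned total would be written back with a kv put after each request. Concurrent requests can overwrite each other's updates, so treat the stored value as an approximation.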

6. Load Balancing

Distribute requests across providers:

- member: select-provider
  type: Function
  config:
    handler: |-
      (input) => {
        const providers = [
          { provider: 'anthropic', model: 'claude-3-5-sonnet-20241022', weight: 0.5 },
          { provider: 'openai', model: 'gpt-4-turbo-preview', weight: 0.3 },
          { provider: 'cloudflare', model: '@cf/meta/llama-3.1-8b-instruct', weight: 0.2 }
        ];
        
        const random = Math.random();
        let cumulative = 0;
        
        for (const p of providers) {
          cumulative += p.weight;
          if (random <= cumulative) {
            return p;
          }
        }
        
        return providers[0];
      }

- member: execute-balanced
  type: Think
  config:
    provider: ${select-provider.output.provider}
    model: ${select-provider.output.model}
  input:
    prompt: ${input.prompt}

Use when:

High volume requests
Multiple providers available
Rate limits are a concern
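
The handler above assumes the weights sum to exactly 1. A variant that normalizes the weights and accepts an injectable random source is easier to test and tolerates weights such as 5/3/2. This is a sketch; pickWeighted is a hypothetical name.

```javascript
// Weighted random selection. Weights may be any non-negative numbers;
// they are normalized by their sum. `rng` defaults to Math.random and
// can be replaced with a deterministic function in tests.
function pickWeighted(providers, rng = Math.random) {
  const total = providers.reduce((sum, p) => sum + p.weight, 0);
  if (total <= 0) throw new Error('Weights must sum to a positive number');
  let threshold = rng() * total;
  for (const p of providers) {
    threshold -= p.weight;
    if (threshold < 0) return p;
  }
  // Guards against floating-point drift when rng() is very close to 1.
  return providers[providers.length - 1];
}
```

Passing a fixed rng makes the routing decision reproducible, which helps when validating the distribution in unit tests.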

7. Performance-Based Routing

Route based on latency requirements:

- member: route-by-latency
  type: Function
  config:
    handler: |-
      (input) => {
        const { maxLatency } = input;
        
        if (maxLatency < 1000) {
          // Fast: < 1 second
          return {
            provider: 'anthropic',
            model: 'claude-3-haiku-20240307'
          };
        } else if (maxLatency < 3000) {
          // Standard: 1-3 seconds
          return {
            provider: 'anthropic',
            model: 'claude-3-5-sonnet-20241022'
          };
        } else {
          // Complex: 3+ seconds OK
          return {
            provider: 'anthropic',
            model: 'claude-3-opus-20240229'
          };
        }
      }

- member: execute-with-latency
  type: Think
  config:
    provider: ${route-by-latency.output.provider}
    model: ${route-by-latency.output.model}
  input:
    prompt: ${input.prompt}

Use when:

SLAs define latency targets
Real-time vs batch processing
User-facing vs background
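
The handler above hardcodes which model is assumed to be fast. If latency samples are being recorded, the decision can be driven by observed data instead. The sketch below assumes each candidate carries an array of recent latencies in milliseconds; the shape of candidates and both function names are illustrative.

```javascript
// Nearest-rank percentile of latency samples (milliseconds).
function percentile(samples, p) {
  if (samples.length === 0) return Infinity; // no data: treat as too slow
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// `candidates` is ordered from most to least capable. Pick the first one
// whose observed p95 fits the budget; otherwise fall back to the last entry.
function routeByObservedLatency(candidates, maxLatency) {
  const fit = candidates.find((c) => percentile(c.samples, 95) <= maxLatency);
  return fit || candidates[candidates.length - 1];
}
```

Using the 95th percentile rather than the mean keeps occasional slow responses from being hidden by many fast ones.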

8. Feature-Based Routing

Route based on required capabilities:

- member: route-by-features
  type: Function
  config:
    handler: |-
      (input) => {
        const { needsVision, needsCode, needsLongContext } = input;
        
        if (needsVision) {
          return {
            provider: 'anthropic',
            model: 'claude-3-5-sonnet-20241022' // Supports vision
          };
        } else if (needsLongContext) {
          return {
            provider: 'openai',
            model: 'gpt-4-turbo-preview' // 128K context
          };
        } else if (needsCode) {
          return {
            provider: 'anthropic',
            model: 'claude-3-5-sonnet-20241022' // Best for code
          };
        } else {
          return {
            provider: 'cloudflare',
            model: '@cf/meta/llama-3.1-8b-instruct' // General
          };
        }
      }

Use when:

Different capabilities needed
Model-specific features required
Capabilities vary by provider
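
An if/else chain returns on the first matching flag, so a request that needs both vision and long context is routed on vision alone. A capability table handles combinations. In this sketch the feature flags mirror the comments in the handler above and should be verified against each provider's documentation; CATALOG and routeByFeatures are hypothetical names.

```javascript
// Models listed from least to most expensive (an assumed ordering).
// Feature flags are illustrative; verify them before relying on them.
const CATALOG = [
  { provider: 'cloudflare', model: '@cf/meta/llama-3.1-8b-instruct', features: [] },
  { provider: 'openai', model: 'gpt-4-turbo-preview', features: ['longContext'] },
  { provider: 'anthropic', model: 'claude-3-5-sonnet-20241022', features: ['vision', 'code', 'longContext'] },
];

// Return the first (cheapest) model that supports every required feature.
function routeByFeatures(required) {
  const match = CATALOG.find((m) => required.every((f) => m.features.includes(f)));
  if (!match) throw new Error(`No model supports: ${required.join(', ')}`);
  return { provider: match.provider, model: match.model };
}
```

For example, routeByFeatures(['vision', 'longContext']) only returns a model that lists both features, and an unsupported combination raises an error instead of silently picking a default.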

Advanced Patterns

Circuit Breaker

Temporarily disable failing providers:

- member: check-circuit
  type: Data
  config:
    type: kv
    operation: get
    key: 'circuit:anthropic'

- member: select-provider
  type: Function
  config:
    handler: |-
      (input) => {
        const circuit = input.checkCircuit.value;
        
        if (circuit?.open && Date.now() < circuit.resetAt) {
          // Circuit open - use alternative
          return { provider: 'openai', model: 'gpt-4-turbo-preview' };
        }
        
        // Circuit closed or reset - use primary
        return { provider: 'anthropic', model: 'claude-3-5-sonnet-20241022' };
      }

- member: execute-request
  type: Think
  config:
    provider: ${select-provider.output.provider}
    model: ${select-provider.output.model}
  input:
    prompt: ${input.prompt}

- member: update-circuit
  condition: ${execute-request.error}
  type: Data
  config:
    type: kv
    operation: put
    key: 'circuit:${select-provider.output.provider}'
    value:
      open: true
      resetAt: ${Date.now() + 60000}  # 1 minute
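
The configuration above opens the circuit on the first error. A common refinement is to open it only after several consecutive failures. The sketch below models that state as plain objects suitable for storing in KV; the threshold of 3 and the function names are assumptions.

```javascript
const FAILURE_THRESHOLD = 3; // assumed: open after 3 consecutive failures
const COOLDOWN_MS = 60000;   // matches the 1-minute reset used above

// Compute the next circuit state after a request finishes.
function recordResult(state, succeeded, now = Date.now()) {
  if (succeeded) return { failures: 0, open: false, resetAt: 0 };
  const failures = ((state && state.failures) || 0) + 1;
  const open = failures >= FAILURE_THRESHOLD;
  return { failures, open, resetAt: open ? now + COOLDOWN_MS : 0 };
}

// A provider is available when its circuit is closed or the cooldown has passed.
function isAvailable(state, now = Date.now()) {
  return !(state && state.open && now < state.resetAt);
}
```

After the cooldown, the next request acts as a probe: a success resets the counter, and a failure reopens the circuit for another cooldown period.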

Retry with Backoff

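Retry a failed request against the next provider in the list, waiting 1 second and then 2 seconds between attempts. The handler assumes an executeAI(provider, prompt, context) helper that is defined elsewhere: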

- member: execute-with-retry
  type: Function
  config:
    handler: |-
      async (input, context) => {
        const providers = [
          { provider: 'anthropic', model: 'claude-3-5-sonnet-20241022' },
          { provider: 'openai', model: 'gpt-4-turbo-preview' },
          { provider: 'cloudflare', model: '@cf/meta/llama-3.1-8b-instruct' }
        ];
        
        for (let i = 0; i < providers.length; i++) {
          try {
            const result = await executeAI(providers[i], input.prompt, context);
            return { result, provider: providers[i].provider, attempts: i + 1 };
          } catch (error) {
            if (i === providers.length - 1) throw error;
            await new Promise(r => setTimeout(r, 1000 * Math.pow(2, i)));
          }
        }
      }
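
When many requests fail at the same moment, a fixed exponential delay makes them all retry together. Adding random jitter spreads the retries out. The following is a sketch of the "full jitter" variant with a cap on the delay; backoffDelay and its option names are hypothetical.

```javascript
// Exponential backoff with full jitter: a random delay in
// [0, min(capMs, baseMs * 2^attempt)). `rng` is injectable for testing.
function backoffDelay(attempt, { baseMs = 1000, capMs = 30000, rng = Math.random } = {}) {
  const ceiling = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.floor(rng() * ceiling);
}
```

In the handler above, the fixed delay `1000 * Math.pow(2, i)` could be replaced with `backoffDelay(i)`.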

A/B Testing

Test different providers/models:

- member: ab-test-provider
  type: Function
  config:
    handler: |-
      (input) => {
        // Coerce to a string so numeric IDs do not break split()
        const userId = String(input.userId);
        const hash = userId.split('').reduce((a, b) => a + b.charCodeAt(0), 0);
        const bucket = hash % 100;
        
        if (bucket < 50) {
          // Group A - Claude
          return {
            provider: 'anthropic',
            model: 'claude-3-5-sonnet-20241022',
            group: 'A'
          };
        } else {
          // Group B - GPT-4
          return {
            provider: 'openai',
            model: 'gpt-4-turbo-preview',
            group: 'B'
          };
        }
      }

- member: execute-ab-test
  type: Think
  config:
    provider: ${ab-test-provider.output.provider}
    model: ${ab-test-provider.output.model}
  input:
    prompt: ${input.prompt}

- member: log-ab-result
  type: Data
  config:
    type: d1
    operation: execute
    query: |-
      INSERT INTO ab_tests (user_id, "group", provider, model, latency, tokens)
      VALUES (?, ?, ?, ?, ?, ?)
    params:
      - ${input.userId}
      - ${ab-test-provider.output.group}
      - ${ab-test-provider.output.provider}
      - ${ab-test-provider.output.model}
      - ${execute-ab-test.duration}
      - ${execute-ab-test.output.tokensUsed}
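
Summing character codes gives the same bucket to any two IDs that are anagrams of each other, and sequential IDs land in adjacent buckets. A mixing hash such as 32-bit FNV-1a spreads IDs more evenly. This is a sketch; bucketFor and the experiment label are hypothetical.

```javascript
// 32-bit FNV-1a over UTF-16 code units (identical to the byte-wise
// algorithm for ASCII input).
function fnv1a(str) {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < str.length; i++) {
    hash ^= str.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0; // multiply by FNV prime, keep unsigned
  }
  return hash;
}

// Map a user to a stable bucket in [0, 100). Including an experiment label
// keeps bucket assignments independent across experiments.
function bucketFor(userId, experiment = 'provider-ab') {
  return fnv1a(`${experiment}:${String(userId)}`) % 100;
}
```

The bucket stays stable across requests for a given user, so each user always sees the same provider for the duration of the experiment.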

Monitoring and Metrics

Track routing effectiveness:

- member: log-provider-usage
  type: Data
  config:
    type: d1
    operation: execute
    query: |-
      INSERT INTO provider_metrics (
        provider, model, latency, tokens, cost, success, timestamp
      ) VALUES (?, ?, ?, ?, ?, ?, ?)
    params:
      - ${execute-request.provider}
      - ${execute-request.model}
      - ${execute-request.duration}
      - ${execute-request.tokensUsed}
      - ${execute-request.estimatedCost}
      - ${!execute-request.error}
      - ${Date.now()}
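
Once rows are being logged, they can be rolled up per provider to compare routing outcomes. The sketch below aggregates in memory and assumes each row has the provider, latency, cost, and success fields from the INSERT above; summarize is a hypothetical helper.

```javascript
// Aggregate metric rows into per-provider request count, success rate,
// average latency, and total cost.
function summarize(rows) {
  const totals = {};
  for (const row of rows) {
    if (!totals[row.provider]) {
      totals[row.provider] = { requests: 0, successes: 0, latency: 0, cost: 0 };
    }
    const t = totals[row.provider];
    t.requests += 1;
    t.successes += row.success ? 1 : 0;
    t.latency += row.latency;
    t.cost += row.cost;
  }
  const summary = {};
  for (const [provider, t] of Object.entries(totals)) {
    summary[provider] = {
      requests: t.requests,
      successRate: t.successes / t.requests,
      avgLatency: t.latency / t.requests,
      totalCost: t.cost,
    };
  }
  return summary;
}
```

For large tables the same rollup is usually better done in SQL with GROUP BY provider; the in-memory version is convenient for tests and small batches.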

Best Practices

Start simple - Use static routing initially
Add fallbacks - Implement for production
Monitor costs - Track spend per provider
Test routing logic - Validate with different scenarios
Use circuit breakers - Handle provider outages
Implement timeouts - Prevent hanging requests
Log routing decisions - Debug and optimize
A/B test providers - Compare quality and cost
Cache routing decisions - Avoid redundant logic
Document strategy - Explain routing choices
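
For the "Implement timeouts" practice, a request can be raced against a timer so that a hanging provider triggers the fallback path. This is a generic sketch built on Promise.race, and withTimeout is a hypothetical helper. It rejects the wrapper promise but does not cancel the underlying request.

```javascript
// Reject if `promise` does not settle within `ms` milliseconds.
// The timer is always cleared so it cannot keep the process alive.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`Timed out after ${ms} ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}
```

Wrapping each provider call this way lets a retry or fallback loop treat a timeout like any other error.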

Related Documentation

AI Overview - AI provider system overview
Model Catalog - Available models
Provider Registry - Managing providers
Error Handling - Error handling patterns
