Overview

Conductor supports multiple AI models across different providers. Choose the right model based on your task complexity, latency requirements, and cost constraints.

Anthropic (Claude)

Claude 3.5 Sonnet

Model: claude-3-5-sonnet-20241022
Best for: Most tasks; the best balance of intelligence and speed
Capabilities:
  • Advanced reasoning and analysis
  • Code generation and review
  • Long-form content creation
  • Multi-step problem solving
  • Fast response times
Pricing: $3/MTok input, $15/MTok output
provider: anthropic
model: claude-3-5-sonnet-20241022

Claude 3 Opus

Model: claude-3-opus-20240229
Best for: Complex tasks requiring the highest intelligence
Capabilities:
  • Highest reasoning ability
  • Complex analysis
  • Expert-level tasks
  • Research and synthesis
Pricing: $15/MTok input, $75/MTok output
provider: anthropic
model: claude-3-opus-20240229

Claude 3 Sonnet

Model: claude-3-sonnet-20240229
Best for: Balanced performance and cost
Capabilities:
  • Strong reasoning
  • General purpose tasks
  • Good speed/quality balance
Pricing: $3/MTok input, $15/MTok output
provider: anthropic
model: claude-3-sonnet-20240229

Claude 3 Haiku

Model: claude-3-haiku-20240307
Best for: Fast, simple tasks
Capabilities:
  • Fastest responses
  • Simple classifications
  • Quick summaries
  • Low-cost operations
Pricing: $0.25/MTok input, $1.25/MTok output
provider: anthropic
model: claude-3-haiku-20240307

OpenAI (GPT)

GPT-4 Turbo

Model: gpt-4-turbo-preview
Best for: Latest GPT-4 capabilities
Capabilities:
  • Advanced reasoning
  • Code generation
  • Complex problem solving
  • Large context window (128K)
Pricing: $10/MTok input, $30/MTok output
provider: openai
model: gpt-4-turbo-preview

GPT-4

Model: gpt-4
Best for: Production workloads needing proven, stable capability
Capabilities:
  • Highest intelligence
  • Reliable and stable
  • Proven performance
Pricing: $30/MTok input, $60/MTok output
provider: openai
model: gpt-4

GPT-3.5 Turbo

Model: gpt-3.5-turbo
Best for: Fast, cost-effective tasks
Capabilities:
  • Quick responses
  • Simple tasks
  • High throughput
  • Low cost
Pricing: $0.50/MTok input, $1.50/MTok output
provider: openai
model: gpt-3.5-turbo

Cloudflare Workers AI

Llama 3.1 (8B)

Model: @cf/meta/llama-3.1-8b-instruct
Best for: Free, edge-optimized, fast inference
Capabilities:
  • No API key required
  • Runs on Cloudflare’s edge
  • Good for testing
  • Free tier available
Pricing: Free tier, then $0.125/1M neurons
provider: cloudflare
model: '@cf/meta/llama-3.1-8b-instruct'

Llama 2 (7B)

Model: @cf/meta/llama-2-7b-chat-int8
Best for: Fast, small-model workloads
Capabilities:
  • Faster responses
  • Lower resource usage
  • Simple tasks
Pricing: Free tier, then $0.125/1M neurons
provider: cloudflare
model: '@cf/meta/llama-2-7b-chat-int8'

Mistral 7B

Model: @cf/mistral/mistral-7b-instruct-v0.1
Best for: An alternative open model
Capabilities:
  • Distinct output style from the Llama models
  • Useful for comparing results across models
  • Developed by Mistral AI
Pricing: Free tier, then $0.125/1M neurons
provider: cloudflare
model: '@cf/mistral/mistral-7b-instruct-v0.1'

Model Selection Guide

By Task Type

Code Generation

Recommended: Claude 3.5 Sonnet, GPT-4
provider: anthropic
model: claude-3-5-sonnet-20241022
temperature: 0.2  # Lower for more deterministic code

Creative Writing

Recommended: Claude 3.5 Sonnet, Claude 3 Opus
provider: anthropic
model: claude-3-5-sonnet-20241022
temperature: 0.8  # Higher for creativity

Classification/Tagging

Recommended: Claude 3 Haiku, GPT-3.5 Turbo, Cloudflare Llama
provider: cloudflare
model: '@cf/meta/llama-3.1-8b-instruct'
temperature: 0.1  # Very deterministic

Analysis/Research

Recommended: Claude 3 Opus, GPT-4
provider: anthropic
model: claude-3-opus-20240229
temperature: 0.5

Summarization

Recommended: Claude 3.5 Sonnet, Claude 3 Haiku
provider: anthropic
model: claude-3-haiku-20240307
temperature: 0.3
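
The task-type recommendations above can be collected into a small lookup helper. This is an illustrative sketch, not part of Conductor's API: the `pick_model` function and the task-type keys are hypothetical names chosen for this example.

```python
# Hypothetical helper mapping task types to the recommended
# provider / model / temperature settings from the sections above.
RECOMMENDATIONS = {
    "code":           ("anthropic",  "claude-3-5-sonnet-20241022", 0.2),
    "creative":       ("anthropic",  "claude-3-5-sonnet-20241022", 0.8),
    "classification": ("cloudflare", "@cf/meta/llama-3.1-8b-instruct", 0.1),
    "analysis":       ("anthropic",  "claude-3-opus-20240229", 0.5),
    "summarization":  ("anthropic",  "claude-3-haiku-20240307", 0.3),
}

def pick_model(task: str) -> dict:
    """Return a config dict for a task type, defaulting to a balanced model."""
    provider, model, temperature = RECOMMENDATIONS.get(
        task, ("anthropic", "claude-3-5-sonnet-20241022", 0.5)
    )
    return {"provider": provider, "model": model, "temperature": temperature}
```

The returned dict mirrors the `provider`/`model`/`temperature` keys used in the config snippets above, so it can be dropped into whatever templating you use to generate workflow YAML.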

By Latency Requirements

Real-Time (under 1s)

  • Claude 3 Haiku
  • GPT-3.5 Turbo
  • Cloudflare Llama 3.1

Standard (1-3 seconds)

  • Claude 3.5 Sonnet
  • GPT-4 Turbo

Complex (3-10 seconds)

  • Claude 3 Opus
  • GPT-4

By Cost Constraints

Free Tier

  • Cloudflare Workers AI models

Low Cost (under $1 per 1M tokens)

  • Claude 3 Haiku: $0.25-$1.25/MTok
  • GPT-3.5 Turbo: $0.50-$1.50/MTok

Medium Cost ($1-$10 per 1M tokens)

  • Claude 3.5 Sonnet: $3-$15/MTok
  • Claude 3 Sonnet: $3-$15/MTok
  • GPT-4 Turbo: $10-$30/MTok

High Cost ($10+ per 1M tokens)

  • Claude 3 Opus: $15-$75/MTok
  • GPT-4: $30-$60/MTok

Testing Strategy

Use cheaper models for development:
# Development
- member: test-generation
  type: Think
  config:
    provider: cloudflare  # Free
    model: '@cf/meta/llama-3.1-8b-instruct'
  input:
    prompt: ${input.prompt}
Switch to production models:
# Production
- member: production-generation
  type: Think
  config:
    provider: ${env.AI_PROVIDER || 'anthropic'}
    model: ${env.AI_MODEL || 'claude-3-5-sonnet-20241022'}
  input:
    prompt: ${input.prompt}
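The `${env.AI_PROVIDER || 'anthropic'}` fallback pattern can be mirrored in application code that generates or validates these configs. A sketch assuming only standard environment variables; `resolve_model_config` is a hypothetical helper, not a Conductor API:

```python
import os

def resolve_model_config() -> dict:
    """Mirror the workflow's ${env.VAR || default} fallbacks."""
    return {
        "provider": os.environ.get("AI_PROVIDER", "anthropic"),
        "model": os.environ.get("AI_MODEL", "claude-3-5-sonnet-20241022"),
    }
```

Unset variables fall back to the production defaults, so the same code runs unchanged in development (where you might export `AI_PROVIDER=cloudflare`) and in production.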

Model Comparison

| Model             | Provider   | Speed   | Quality   | Cost        | Best For         |
|-------------------|------------|---------|-----------|-------------|------------------|
| Claude 3.5 Sonnet | Anthropic  | Fast    | Excellent | Medium      | Most tasks       |
| Claude 3 Opus     | Anthropic  | Medium  | Best      | High        | Complex analysis |
| Claude 3 Haiku    | Anthropic  | Fastest | Good      | Low         | Simple tasks     |
| GPT-4 Turbo       | OpenAI     | Medium  | Excellent | Medium-High | Latest features  |
| GPT-4             | OpenAI     | Medium  | Excellent | High        | Production       |
| GPT-3.5 Turbo     | OpenAI     | Fast    | Good      | Low         | High volume      |
| Llama 3.1 8B      | Cloudflare | Fast    | Good      | Free        | Testing          |

Best Practices

  1. Start with free models - Test with Cloudflare first
  2. Match model to task - Don’t over-spec for simple tasks
  3. Monitor costs - Track token usage
  4. Use temperature wisely - Lower for deterministic, higher for creative
  5. Test locally - Validate prompts before production
  6. Implement fallbacks - Have backup models
  7. Cache results - Avoid redundant calls
  8. Set max tokens - Control response length
  9. Version your prompts - Track what works
  10. Measure quality - A/B test different models
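
Practices 6 (implement fallbacks) and 7 (cache results) combine naturally in a thin wrapper around whatever client you call models with. A hedged sketch: `call_model` is a stand-in for your own client function, and the in-memory dict is a placeholder for a real cache.

```python
import hashlib

# In-memory cache; swap for a persistent store (e.g. Workers KV) in production.
_cache: dict[str, str] = {}

def generate(prompt: str, call_model,
             models=("claude-3-5-sonnet-20241022", "gpt-4-turbo-preview")) -> str:
    """Try each model in order (practice 6), caching results (practice 7)."""
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key in _cache:          # avoid redundant calls for identical prompts
        return _cache[key]
    last_err = None
    for model in models:       # fall back to the next model on failure
        try:
            result = call_model(model, prompt)
        except Exception as err:
            last_err = err
            continue
        _cache[key] = result
        return result
    raise RuntimeError("all models failed") from last_err
```

Note the cache key covers only the prompt here; in practice you would include the model name and any parameters (temperature, max tokens) that change the output.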