The think operation is your gateway to AI-powered intelligence: it sends a prompt to a large language model and returns the response. Use it for any task involving natural language understanding, content generation, or complex reasoning.
Basic Usage
operations:
  - name: analyze
    operation: think
    config:
      provider: openai
      model: gpt-4o-mini
      prompt: "Analyze this text: ${input.text}"
Configuration Options
Required Fields
config :
provider : string # openai, anthropic, cloudflare, groq
model : string # Model identifier
prompt : string # Prompt text with template expressions
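To make the idea of template expressions concrete, here is a minimal sketch of how a `${path.to.value}` expression could be resolved against a data object. `renderTemplate` and `lookup` are hypothetical helpers written for illustration; they are not Conductor's resolver, which may support more syntax than simple dotted paths.

```typescript
// Illustrative only: a tiny resolver for ${path.to.value} expressions.
// `renderTemplate` and `lookup` are hypothetical, not part of Conductor's API.
type Scope = Record<string, unknown>;

function lookup(scope: Scope, path: string): unknown {
  // Walk "input.text" one segment at a time; stop at the first missing key.
  let current: unknown = scope;
  for (const key of path.trim().split(".")) {
    if (current === null || typeof current !== "object") return undefined;
    current = (current as Scope)[key];
  }
  return current;
}

function renderTemplate(template: string, scope: Scope): string {
  return template.replace(/\$\{([^}]+)\}/g, (match: string, path: string) => {
    const value = lookup(scope, path);
    // Leave unresolved expressions untouched so mistakes stay visible.
    if (value === undefined) return match;
    return typeof value === "string" ? value : JSON.stringify(value);
  });
}

const rendered = renderTemplate("Analyze this text: ${input.text}", {
  input: { text: "Great service!" },
});
console.log(rendered); // prints "Analyze this text: Great service!"
```

Because the path is split only on dots, names containing hyphens (such as `first-response.output`) resolve the same way as `input.text` in this sketch.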
Optional Fields
config :
temperature : 0.7 # Randomness (0.0-2.0, default: 0.7)
maxTokens : 1000 # Max output tokens (default: model-specific)
systemPrompt : string # System message for context
responseFormat : json # json or text (default: text)
topP : 1.0 # Nucleus sampling (0.0-1.0)
frequencyPenalty : 0.0 # Penalize frequent tokens (-2.0 to 2.0)
presencePenalty : 0.0 # Penalize repeated tokens (-2.0 to 2.0)
stop : [ string ] # Stop sequences
seed : number # Deterministic sampling seed
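The ranges in the comments above can be checked before a request is sent. The sketch below assumes a hypothetical `ThinkConfig` type that mirrors the documented field names; it is not a type exported by Conductor, and the real runtime may validate differently.

```typescript
// Hypothetical shape mirroring the documented fields; not Conductor's own type.
interface ThinkConfig {
  provider: "openai" | "anthropic" | "cloudflare" | "groq";
  model: string;
  prompt: string;
  temperature?: number;
  maxTokens?: number;
  systemPrompt?: string;
  responseFormat?: "json" | "text" | object;
  topP?: number;
  frequencyPenalty?: number;
  presencePenalty?: number;
  stop?: string[];
  seed?: number;
}

// An omitted optional field is always acceptable.
function inRange(value: number | undefined, min: number, max: number): boolean {
  return value === undefined || (value >= min && value <= max);
}

// Returns a list of human-readable problems; an empty list means the config passed.
function validateThinkConfig(config: ThinkConfig): string[] {
  const errors: string[] = [];
  if (!inRange(config.temperature, 0, 2)) errors.push("temperature must be between 0.0 and 2.0");
  if (!inRange(config.topP, 0, 1)) errors.push("topP must be between 0.0 and 1.0");
  if (!inRange(config.frequencyPenalty, -2, 2)) errors.push("frequencyPenalty must be between -2.0 and 2.0");
  if (!inRange(config.presencePenalty, -2, 2)) errors.push("presencePenalty must be between -2.0 and 2.0");
  if (config.maxTokens !== undefined && (!Number.isInteger(config.maxTokens) || config.maxTokens <= 0)) {
    errors.push("maxTokens must be a positive integer");
  }
  return errors;
}

const problems = validateThinkConfig({
  provider: "openai",
  model: "gpt-4o-mini",
  prompt: "Analyze this text",
  temperature: 2.5,
  topP: 1.2,
});
console.log(problems);
```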
Provider Selection
OpenAI (GPT Models)
Fast, high-quality models with structured outputs.
config :
provider : openai
model : gpt-4o-mini # Recommended: fast & cheap
Available Models :
gpt-4o - Most capable, multimodal
gpt-4o-mini - Fast, cost-effective (recommended)
o1-mini - Advanced reasoning
gpt-4-turbo - Previous generation
Pricing (per 1M tokens):
gpt-4o-mini: $0.15 input / $0.60 output
gpt-4o: $2.50 input / $10.00 output
Anthropic (Claude Models)
Strong reasoning, long context, extended thinking.
config :
provider : anthropic
model : claude-3-5-sonnet-20241022
Available Models :
claude-3-5-sonnet-20241022 - Most capable
claude-3-5-haiku-20241022 - Fast, cost-effective
claude-3-opus-20240229 - Previous generation
Pricing (per 1M tokens):
claude-3-5-haiku: $0.80 input / $4.00 output
claude-3-5-sonnet: $3.00 input / $15.00 output
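Per-million-token prices translate into a per-request estimate with simple arithmetic. The sketch below uses the OpenAI and Anthropic prices listed on this page; `estimateCostUsd` and the `PRICING` table are hypothetical names, and real prices should be checked against each provider before relying on the numbers.

```typescript
// USD per 1M tokens, copied from the pricing lists on this page.
const PRICING: Record<string, { input: number; output: number }> = {
  "gpt-4o-mini": { input: 0.15, output: 0.6 },
  "gpt-4o": { input: 2.5, output: 10.0 },
  "claude-3-5-haiku": { input: 0.8, output: 4.0 },
  "claude-3-5-sonnet": { input: 3.0, output: 15.0 },
};

// Hypothetical helper: cost = (input tokens * input price + output tokens * output price) / 1M.
function estimateCostUsd(model: string, inputTokens: number, outputTokens: number): number {
  const price = PRICING[model];
  if (!price) throw new Error(`No pricing entry for model: ${model}`);
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

// Same request (2,000 input tokens, 500 output tokens) on two models.
const miniCost = estimateCostUsd("gpt-4o-mini", 2_000, 500);
const fullCost = estimateCostUsd("gpt-4o", 2_000, 500);
console.log(miniCost.toFixed(6), fullCost.toFixed(6));
```

For this request the estimate works out to roughly $0.0006 on gpt-4o-mini and $0.01 on gpt-4o.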
Cloudflare Workers AI
Edge-native models with free tier.
config :
provider : cloudflare
model : '@cf/meta/llama-3.1-8b-instruct'
Available Models :
@cf/meta/llama-3.1-8b-instruct - Fast, general purpose
@cf/meta/llama-3.1-70b-instruct - More capable
@cf/mistral/mistral-7b-instruct-v0.1 - Fast instruction following
Pricing : Free tier - 10,000 requests/day
Groq
Ultra-fast inference with LPU acceleration.
config :
provider : groq
model : llama-3.1-8b-instant
Available Models :
llama-3.1-8b-instant - Fastest (~200ms response)
llama-3.1-70b-versatile - More capable
mixtral-8x7b-32768 - Long context window
Machine Learning Models
For ML inference (embeddings, image classification, object detection, vision), use Workers AI models via the workers-ai provider.
See Machine Learning for the complete guide, including:
Text embeddings (7 models)
Image classification
Object detection
Vision models
Text classification
System Prompts
Basic System Prompt
config :
systemPrompt : |
You are a helpful assistant that analyzes text sentiment.
Always respond in JSON format.
config :
systemPrompt : |
Analyze the company and respond with JSON:
{
"industry": "string",
"size": "small" | "medium" | "large",
"confidence": number (0-1),
"summary": "string"
}
Role-Based Prompts
config :
systemPrompt : |
You are an expert business analyst with 20 years of experience.
Analyze companies objectively and provide actionable insights.
Focus on:
- Financial health
- Market position
- Growth potential
- Competitive advantages
Few-Shot Prompts
config :
systemPrompt : |
Classify customer feedback sentiment.
Examples:
Input: "I love this product! Best purchase ever!"
Output: {"sentiment": "positive", "confidence": 0.95}
Input: "It's okay, nothing special."
Output: {"sentiment": "neutral", "confidence": 0.7}
Input: "Terrible quality, waste of money."
Output: {"sentiment": "negative", "confidence": 0.9}
Now classify the following:
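When the examples live in data rather than in a hand-written string, a few-shot system prompt like the one above can be assembled programmatically. This is a sketch under that assumption; `buildFewShotPrompt` and `FewShotExample` are hypothetical names, and the output format simply mirrors the prompt shown above.

```typescript
interface FewShotExample {
  input: string;
  output: { sentiment: "positive" | "neutral" | "negative"; confidence: number };
}

// Hypothetical helper: renders a task description plus Input/Output pairs.
function buildFewShotPrompt(task: string, examples: FewShotExample[]): string {
  const lines: string[] = [task, "Examples:"];
  for (const example of examples) {
    // JSON.stringify quotes the input and serializes the expected output.
    lines.push(`Input: ${JSON.stringify(example.input)}`);
    lines.push(`Output: ${JSON.stringify(example.output)}`);
  }
  lines.push("Now classify the following:");
  return lines.join("\n");
}

const fewShotPrompt = buildFewShotPrompt("Classify customer feedback sentiment.", [
  { input: "I love this product! Best purchase ever!", output: { sentiment: "positive", confidence: 0.95 } },
  { input: "It's okay, nothing special.", output: { sentiment: "neutral", confidence: 0.7 } },
  { input: "Terrible quality, waste of money.", output: { sentiment: "negative", confidence: 0.9 } },
]);
console.log(fewShotPrompt);
```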
Common Patterns
Sentiment Analysis
operations :
- name : analyze-sentiment
operation : think
config :
provider : openai
model : gpt-4o-mini
temperature : 0.3 # Lower for consistency
maxTokens : 50
prompt : |
Analyze the sentiment of this text.
Return only: positive, negative, or neutral.
Text: ${input.text}
Sentiment:
Classification
operations :
- name : classify-intent
operation : think
config :
provider : cloudflare
model : '@cf/meta/llama-3.1-8b-instruct'
temperature : 0.2
maxTokens : 50
systemPrompt : |
Classify user intent into: question, request, complaint, or praise.
Respond with only one word.
prompt : ${input.message}
Entity Extraction
operations :
- name : extract-entities
operation : think
config :
provider : anthropic
model : claude-3-5-sonnet-20241022
temperature : 0.1
maxTokens : 500
responseFormat : json
systemPrompt : |
Extract company information from the text.
Return JSON with: name, industry, location, employees, founded.
prompt : ${input.text}
Text Summarization
operations :
- name : summarize-article
operation : think
config :
provider : openai
model : gpt-4o-mini
temperature : 0.5
maxTokens : 200
systemPrompt : |
Summarize the article in 2-3 sentences.
Focus on key points and main ideas.
prompt : |
Article:
${input.article}
Summary:
Content Generation
operations :
- name : generate-blog-post
operation : think
config :
provider : openai
model : gpt-4o
temperature : 0.8
maxTokens : 2000
systemPrompt : |
Write an engaging blog post on the given topic.
Include:
- Attention-grabbing headline
- Introduction with hook
- 3-5 main points with examples
- Conclusion with call-to-action
prompt : |
Topic: ${input.topic}
Target audience: ${input.audience}
Tone: ${input.tone}
Question Answering (RAG)
operations :
- name : answer-question
operation : think
config :
provider : anthropic
model : claude-3-5-sonnet-20241022
temperature : 0.3
maxTokens : 500
systemPrompt : |
Answer the question based only on the provided context.
If the answer isn't in the context, say "I don't know."
Context: ${input.context}
prompt : |
Question: ${input.question}
Answer:
Translation
operations :
- name : translate
operation : think
config :
provider : openai
model : gpt-4o-mini
temperature : 0.3
maxTokens : 1000
prompt : |
Translate this text from ${input.from} to ${input.to}:
${input.text}
Translation:
Structured Outputs
JSON Mode (OpenAI)
operations :
- name : extract-json
operation : think
config :
provider : openai
model : gpt-4o-mini
responseFormat : json
prompt : |
Extract information from this text and return as JSON:
{
"name": "person's name",
"email": "email address",
"intent": "purchase|support|inquiry"
}
Text: ${input.message}
JSON Schema (OpenAI Structured Outputs)
operations:
  - name: extract-structured
    operation: think
    config:
      provider: openai
      model: gpt-4o-mini
      responseFormat:
        type: json_schema
        json_schema:
          name: company_analysis
          strict: true
          schema:
            type: object
            required: [industry, employees, founded, summary]
            properties:
              industry:
                type: string
              employees:
                type: number
              founded:
                type: number
              summary:
                type: string
            additionalProperties: false
      prompt: "Extract company info from: ${input.text}"
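Even with a schema in place, it can be worth checking a response defensively before using it downstream, for example when the same flow also runs against a provider that only supports plain JSON mode. The sketch below hand-checks the `company_analysis` shape from the schema above; `parseCompanyAnalysis` is a hypothetical helper, not something Conductor provides.

```typescript
interface CompanyAnalysis {
  industry: string;
  employees: number;
  founded: number;
  summary: string;
}

// Expected primitive type for each required field, mirroring the schema above.
const FIELD_TYPES: Record<keyof CompanyAnalysis, "string" | "number"> = {
  industry: "string",
  employees: "number",
  founded: "number",
  summary: "string",
};

// Hypothetical helper: parses raw model output and rejects anything off-schema.
function parseCompanyAnalysis(raw: string): CompanyAnalysis {
  const parsed: unknown = JSON.parse(raw);
  if (parsed === null || typeof parsed !== "object" || Array.isArray(parsed)) {
    throw new Error("Expected a JSON object");
  }
  const record = parsed as Record<string, unknown>;
  for (const [field, expected] of Object.entries(FIELD_TYPES)) {
    if (typeof record[field] !== expected) {
      throw new Error(`Field "${field}" must be a ${expected}`);
    }
  }
  // additionalProperties: false means unknown keys are rejected.
  for (const key of Object.keys(record)) {
    if (!(key in FIELD_TYPES)) throw new Error(`Unexpected field "${key}"`);
  }
  return record as unknown as CompanyAnalysis;
}

const analysis = parseCompanyAnalysis(
  '{"industry": "software", "employees": 120, "founded": 2015, "summary": "B2B SaaS vendor"}'
);
console.log(analysis.industry, analysis.employees);
```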
Temperature Guide
Temperature controls randomness and creativity:
# Low temperature (0.0-0.3): Deterministic, focused
operations :
- name : classify
operation : think
config :
temperature : 0.2 # Consistent results
prompt : "Classify: ${input.text}"
# Medium temperature (0.5-0.8): Balanced creativity
- name : write-content
operation : think
config :
temperature : 0.7 # Natural language
prompt : "Write about: ${input.topic}"
# High temperature (1.0-2.0): Maximum creativity
- name : brainstorm
operation : think
config :
temperature : 1.5 # Diverse ideas
prompt : "Brainstorm ideas for: ${input.topic}"
Token Limits
Control output length and cost:
config :
# Short responses
maxTokens : 100
# Medium responses
maxTokens : 500
# Long responses
maxTokens : 2000
# Maximum (model-dependent)
maxTokens : 4000
schema :
input :
type : object
properties :
text :
type : string
required : [ text ]
Multiple Fields
schema :
input :
type : object
properties :
companyName :
type : string
website :
type : string
industry :
type : string
required : [ companyName ]
Use in prompt:
config :
systemPrompt : |
Analyze ${input.companyName} in the ${input.industry} industry.
Website: ${input.website}
Messages Array (Conversations)
schema:
  input:
    type: object
    properties:
      messages:
        type: array
        items:
          type: object
          properties:
            role:
              type: string
              enum: [user, assistant, system]
            content:
              type: string
For multi-turn conversations:
// Pass conversation history
input: {
  messages: [
    { role: "user", content: "What's the capital of France?" },
    { role: "assistant", content: "Paris" },
    { role: "user", content: "What's its population?" }
  ]
}
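Conversation history grows with every turn, and every message counts toward input tokens. One common way to keep it bounded is to retain system messages and only the most recent turns. The sketch below assumes the `role`/`content` message shape from the schema above; `trimHistory` is a hypothetical helper, and whether trimming is appropriate depends on how much earlier context the task needs.

```typescript
type Role = "user" | "assistant" | "system";

interface Message {
  role: Role;
  content: string;
}

// Hypothetical helper: keep all system messages plus the last `maxTurns` other messages.
function trimHistory(messages: Message[], maxTurns: number): Message[] {
  const system = messages.filter((m) => m.role === "system");
  const rest = messages.filter((m) => m.role !== "system");
  // slice(-0) would return everything, so handle zero explicitly.
  const recent = maxTurns > 0 ? rest.slice(-maxTurns) : [];
  return [...system, ...recent];
}

const history: Message[] = [
  { role: "system", content: "Answer briefly." },
  { role: "user", content: "What's the capital of France?" },
  { role: "assistant", content: "Paris" },
  { role: "user", content: "What's its population?" },
];

const trimmed = trimHistory(history, 2);
console.log(trimmed.map((m) => `${m.role}: ${m.content}`).join("\n"));
```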
Advanced Techniques
Chain of Thought
Encourage step-by-step reasoning:
operations :
- name : solve-problem
operation : think
config :
provider : openai
model : gpt-4o-mini
temperature : 0.7
systemPrompt : |
Think through problems step by step.
Show your reasoning before giving an answer.
prompt : |
${input.question}
Let's think step by step:
Self-Consistency
Run the same prompt several times and pick the most common answer:
operations :
- name : answer-1
operation : think
config :
provider : openai
model : gpt-4o-mini
temperature : 0.8
prompt : ${input.question}
- name : answer-2
operation : think
config :
provider : openai
model : gpt-4o-mini
temperature : 0.8
prompt : ${input.question}
- name : answer-3
operation : think
config :
provider : openai
model : gpt-4o-mini
temperature : 0.8
prompt : ${input.question}
- name : consensus
operation : code
config :
script : scripts/pick-consensus-answer
input :
answer1 : ${answer-1.output}
answer2 : ${answer-2.output}
answer3 : ${answer-3.output}
// scripts/pick-consensus-answer.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default function pickConsensusAnswer(context: AgentExecutionContext) {
  const { answer1, answer2, answer3 } = context.input
  const answers = [answer1, answer2, answer3]

  // Count occurrences
  const counts = new Map<string, number>()
  answers.forEach(answer => {
    counts.set(answer, (counts.get(answer) || 0) + 1)
  })

  // Find most common
  let mostCommon = answers[0]
  let maxCount = 0
  counts.forEach((count, answer) => {
    if (count > maxCount) {
      maxCount = count
      mostCommon = answer
    }
  })

  return { answer: mostCommon }
}
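Majority voting on raw strings only works when the answers match exactly, and free-text model output often differs in case, whitespace, or trailing punctuation. A possible refinement, sketched below, is to normalize each answer before counting. `normalizeAnswer` and `pickConsensus` are hypothetical helpers and the normalization rules are an assumption; for longer answers it is usually easier to ask the model for a short, constrained final answer.

```typescript
// Lowercase, trim, collapse whitespace, and drop trailing punctuation.
function normalizeAnswer(answer: string): string {
  return answer.trim().toLowerCase().replace(/\s+/g, " ").replace(/[.!?]+$/, "");
}

interface Tally {
  original: string; // first spelling seen for this normalized answer
  votes: number;
}

// Hypothetical helper: majority vote over normalized answers.
function pickConsensus(answers: string[]): { answer: string; votes: number } {
  const counts = new Map<string, Tally>();
  for (const answer of answers) {
    const key = normalizeAnswer(answer);
    const entry = counts.get(key);
    if (entry) {
      entry.votes += 1;
    } else {
      counts.set(key, { original: answer, votes: 1 });
    }
  }

  // Map preserves insertion order, so on a tie the earliest answer wins.
  let best: Tally | undefined;
  for (const entry of Array.from(counts.values())) {
    if (!best || entry.votes > best.votes) best = entry;
  }
  if (!best) throw new Error("Need at least one answer");
  return { answer: best.original, votes: best.votes };
}

const consensus = pickConsensus(["Paris.", "paris", "Lyon"]);
console.log(consensus);
```

Here "Paris." and "paris" count as the same answer, so the result carries two votes and returns the first spelling seen.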
Multi-Turn Conversations
Build context across operations:
operations :
- name : first-response
operation : think
config :
provider : openai
model : gpt-4o-mini
prompt : ${input.user_message}
- name : follow-up
operation : think
config :
provider : openai
model : gpt-4o-mini
systemPrompt : |
Previous conversation:
User: ${input.user_message}
Assistant: ${first-response.output}
prompt : ${input.follow_up_message}
Cost Optimization
1. Use Cheaper Models
# Good: use mini for simple tasks
operations :
- name : classify-email
operation : think
config :
model : gpt-4o-mini # $0.15/1M input tokens
# Only use expensive models when needed
- name : complex-analysis
condition : ${classify-email.output.confidence < 0.8}
operation : think
config :
model : gpt-4o # $2.50/1M input tokens
2. Aggressive Caching
operations :
- name : analyze-sentiment
operation : think
config :
provider : openai
model : gpt-4o-mini
prompt : ${input.text}
cache :
ttl : 86400 # 24 hours
key : sentiment-${input.text}
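Embedding the full input text in a cache key can produce very long keys, and inputs that differ only in case or whitespace will miss the cache. One option is to compute a short, normalized key in a preceding step and reference that instead. The sketch below is an assumption about how such a key could be derived; `fnv1a` and `cacheKey` are hypothetical helpers, and the hash is a non-cryptographic 32-bit FNV-1a variant, so rare collisions are possible.

```typescript
// 32-bit FNV-1a-style hash over UTF-16 code units.
// Non-cryptographic: fine for cache keys, not for anything security-sensitive.
function fnv1a(text: string): string {
  let hash = 0x811c9dc5; // FNV offset basis
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193); // FNV prime, 32-bit multiply
  }
  // Convert to an unsigned value and render as 8 hex digits.
  return (hash >>> 0).toString(16).padStart(8, "0");
}

// Hypothetical helper: normalize the text, then prefix the hash.
function cacheKey(prefix: string, text: string): string {
  // Collapse whitespace and case so trivially different inputs share an entry.
  const normalized = text.trim().replace(/\s+/g, " ").toLowerCase();
  return `${prefix}-${fnv1a(normalized)}`;
}

console.log(cacheKey("sentiment", "  I love   this product! "));
console.log(cacheKey("sentiment", "i love this product!"));
```

Both calls print the same key, because the inputs normalize to the same string.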
3. Lower Temperature for Cache Hits
config :
temperature : 0.1 # More deterministic = better cache hit rate
4. Limit Token Usage
config :
maxTokens : 100 # Only what you need
systemPrompt : "Be concise. Maximum 50 words."
5. Use Workers AI Free Tier
config :
provider : cloudflare # Free: 10k requests/day
model : '@cf/meta/llama-3.1-8b-instruct'
6. Track AI Costs with Telemetry
Emit token usage to Analytics Engine for cost tracking and billing:
agents :
- name : generate
operation : think
config :
provider : openai
model : gpt-4o-mini
prompt : ${input.text}
- name : track-costs
operation : telemetry
input :
blobs :
- ai_inference
- openai
- gpt-4o-mini
doubles :
- ${generate.output.usage.inputTokens}
- ${generate.output.usage.outputTokens}
indexes :
- ${input.customerId}
See telemetry operation for querying examples.
Use Workers AI for Speed
# Sub-50ms cold start, sub-10ms warm
operations :
- name : fast-classify
operation : think
config :
provider : cloudflare
model : '@cf/meta/llama-3.1-8b-instruct'
Use Groq for Fast Inference
# ~200ms response time
operations :
- name : quick-response
operation : think
config :
provider : groq
model : llama-3.1-70b-versatile
Parallel Operations
Run multiple AI operations in parallel:
operations :
- name : sentiment
operation : think
config :
model : gpt-4o-mini
prompt : "Sentiment: ${input.text}"
- name : entities
operation : think
config :
model : gpt-4o-mini
prompt : "Extract entities: ${input.text}"
- name : summary
operation : think
config :
model : gpt-4o-mini
prompt : "Summarize: ${input.text}"
# All three run in parallel automatically
Error Handling
Retry on Failure
operations :
- name : generate-content
operation : think
config :
provider : openai
model : gpt-4o-mini
prompt : ${input.prompt}
retry :
maxAttempts : 3
backoff : exponential
initialDelay : 1000
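To illustrate what an exponential backoff schedule looks like, the sketch below assumes the delay doubles after each failed attempt and that `initialDelay` is in milliseconds. The doubling factor, the absence of jitter, and the helper names `backoffDelay` and `withRetry` are assumptions made for illustration; Conductor's actual retry implementation may differ.

```typescript
// Delay before the next try, given the 1-based number of the attempt that just failed.
function backoffDelay(attempt: number, initialDelay: number): number {
  return initialDelay * 2 ** (attempt - 1);
}

// Hypothetical wrapper: run `task` up to `maxAttempts` times, waiting between failures.
async function withRetry<T>(
  task: () => Promise<T>,
  maxAttempts: number,
  initialDelay: number
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      // No point waiting after the final attempt.
      if (attempt < maxAttempts) {
        const delay = backoffDelay(attempt, initialDelay);
        await new Promise((resolve) => setTimeout(resolve, delay));
      }
    }
  }
  throw lastError;
}

// Waits between attempts for maxAttempts: 3, initialDelay: 1000.
const schedule = [1, 2].map((attempt) => backoffDelay(attempt, 1000));
console.log(schedule);
```

Under these assumptions, three attempts with an initial delay of 1000 wait 1000 ms and then 2000 ms before giving up.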
Fallback Operation
operations :
- name : primary-ai
operation : think
config :
provider : openai
model : gpt-4o
- name : fallback-ai
condition : ${!primary-ai.output}
operation : think
config :
provider : cloudflare
model : '@cf/meta/llama-3.1-8b-instruct'
Handle Rate Limits
operations :
- name : ai-operation
operation : think
config :
provider : openai
model : gpt-4o-mini
prompt : ${input.text}
retry :
maxAttempts : 5
backoff : exponential
initialDelay : 2000
Output Parsing
Think operations support schema-aware output mapping - when you define an output schema,
the AI response is automatically mapped to your schema field names, making outputs intuitive
to use in ensembles.
Schema-Aware Output (Recommended)
Define your output schema and access results using your field names:
# Agent definition (agents/greeter/agent.yaml)
name : greeter
operation : think
config :
provider : cloudflare
model : '@cf/meta/llama-3.1-8b-instruct'
prompt : |
Greet the user warmly. Keep it under 20 words.
Name: {{input.name}}
schema :
input :
name : string
output :
greeting : string # Define your output field name
# In ensembles, access using your schema field name:
# ${greeter.output.greeting} ✅ Uses your schema field!
# Ensemble using the greeter agent
name : welcome-flow
trigger :
- type : http
path : /welcome
methods : [ POST ]
public : true
flow :
- agent : greeter
input :
name : ${input.userName}
output :
message : ${greeter.output.greeting} # Intuitive access!
model : ${greeter.output._meta.model} # Metadata via _meta
How it works:
Schema defines output: { greeting: string } → AI response maps to greeting field
If AI returns valid JSON, all fields are spread to top level
Metadata (model, provider, tokensUsed) available via _meta
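The mapping rules above can be pictured as a small function. The sketch below is an interpretation of those rules, not Conductor's source: `mapThinkOutput` is a hypothetical name, and the fallback to a `content` field when the schema does not name exactly one output field is an assumption.

```typescript
interface Meta {
  model: string;
  provider: string;
  tokensUsed: number;
}

// Returns the parsed value only when the raw text is a JSON object.
function tryParseObject(raw: string): Record<string, unknown> | undefined {
  try {
    const parsed: unknown = JSON.parse(raw);
    if (parsed !== null && typeof parsed === "object" && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>;
    }
  } catch {
    // Not JSON; fall through to the plain-text path.
  }
  return undefined;
}

// Hypothetical sketch of schema-aware output mapping.
function mapThinkOutput(raw: string, schemaFields: string[], meta: Meta): Record<string, unknown> {
  const parsed = tryParseObject(raw);
  if (parsed) {
    // Valid JSON object: spread all fields to the top level.
    return { ...parsed, _meta: meta };
  }
  // Plain text: use the single schema field if there is exactly one, else `content`.
  const [only] = schemaFields;
  const key = schemaFields.length === 1 && only !== undefined ? only : "content";
  return { [key]: raw, _meta: meta };
}

const meta: Meta = { model: "gpt-4o-mini", provider: "openai", tokensUsed: 42 };
console.log(mapThinkOutput("Hello, Ada!", ["greeting"], meta));
console.log(mapThinkOutput('{"sentiment": "positive", "keywords": ["love"]}', ["sentiment", "keywords"], meta));
```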
Text Output (Simple)
For simple text responses without schema:
operations :
- name : generate-text
operation : think
config :
provider : openai
model : gpt-4o-mini
prompt : "Write a tagline for: ${input.product}"
# Access output (no schema = content field)
outputs :
tagline : ${generate-text.output.content}
JSON Output (Structured)
When the AI returns JSON, fields are automatically available at top level:
# Agent that returns structured JSON
name : analyzer
operation : think
config :
provider : openai
model : gpt-4o-mini
responseFormat : json
prompt : |
Analyze the sentiment and extract keywords as JSON:
{"sentiment": "positive|negative|neutral", "keywords": ["word1", "word2"]}
Text: {{input.text}}
schema :
output :
sentiment : string
keywords : array
# Ensemble accessing structured output
flow :
- agent : analyzer
input :
text : ${input.text}
output :
sentiment : ${analyzer.output.sentiment} # Direct access!
keywords : ${analyzer.output.keywords} # Arrays work too
model : ${analyzer.output._meta.model} # Metadata
All think operations include metadata in the _meta field:
# Available metadata fields:
${agent.output._meta.model} # Model used (e.g., "gpt-4o-mini")
${agent.output._meta.provider} # Provider (e.g., "openai")
${agent.output._meta.tokensUsed} # Total tokens consumed
Testing
Test AI operations with mocks:
import { TestConductor } from '@ensemble/conductor/testing';

const conductor = await TestConductor.create({
  projectPath: './conductor',
  mocks: {
    ai: {
      'analyze-sentiment': {
        output: 'positive'
      },
      'extract-entities': {
        output: '{"people": ["Alice"], "orgs": ["Anthropic"]}'
      }
    }
  }
});

const result = await conductor.executeAgent('my-agent', {
  text: 'I love this product!'
});

expect(result.output).toBeDefined();
Best Practices
1. Choose the Right Model
# Simple tasks: Use fast, cheap models
model : gpt-4o-mini
# Complex reasoning: Use advanced models
model : gpt-4o
2. Set Appropriate Temperature
# Deterministic tasks (classification, extraction)
temperature : 0.1-0.3
# Creative tasks (content generation)
temperature : 0.7-1.0
3. Use System Prompts
systemPrompt : |
You are an expert in ${domain}.
Follow these rules:
- Be concise
- Provide examples
- Cite sources when possible
4. Provide Examples (Few-Shot)
prompt : |
Examples:
Input: "..." Output: "..."
Input: "..." Output: "..."
Now classify:
Input: ${input.text}
5. Request Structured Output
responseFormat : json
prompt : |
Return as JSON: {"field": "value"}
${input.text}
6. Cache Expensive Operations
cache :
ttl : 3600
key : ${config.model}-${input.text}
7. Set Token Limits
maxTokens : 500 # Prevent runaway costs
8. Handle Errors with Retry
retry :
maxAttempts : 3
backoff : exponential
Common Issues
Issue: Inconsistent Outputs
Solution : Lower temperature
temperature : 0.0-0.3 # More deterministic
Issue: Truncated Responses
Solution : Increase max tokens
maxTokens : 2000 # Longer responses
Issue: High Costs
Solution : Use cheaper models + caching
model : gpt-4o-mini # ~17x cheaper than gpt-4o
cache :
ttl : 86400 # Cache for 24h
Issue: Slow Responses
Solution : Use faster providers
provider : groq # Ultra-fast inference
# or
provider : cloudflare # Edge-native
Issue: Rate Limits
Solution : Add retry logic + backoff
retry :
maxAttempts : 5
backoff : exponential
initialDelay : 1000
Next Steps
code Operation: JavaScript execution
storage Operation: Data persistence
Starter Kit: Agents using think operation
Playbooks: Common agent patterns