Machine learning models run at the edge via Cloudflare Workers AI. Access them through Conductor’s think operation using the workers-ai provider.
Overview
Workers AI provides serverless GPU inference for ML models:
Free tier: 10,000 requests/day
Latency: Runs at the edge, near your users
Provider: Use workers-ai in think operation
Binding: Requires [ai] binding in wrangler.toml
Model Categories:
Text Embeddings (7 models)
Image Classification (1 model)
Object Detection (1 model)
Image-to-Text (2 models)
Vision Models (2 multimodal LLMs)
Text Classification (2 models)
Configuration
wrangler.toml
Environment Variable
Set CONDUCTOR_AI_PROVIDER=workers-ai or configure per-agent.
Text Embeddings
Convert text into vector representations for semantic search, RAG, clustering, and similarity tasks.
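Similarity between two embeddings is commonly measured with cosine similarity. The following is a minimal sketch of that calculation; the helper name cosineSimilarity is illustrative and is not part of Conductor or Workers AI.

```typescript
// Hypothetical helper: compares two embedding vectors of equal length.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same number of dimensions')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  // A zero vector has no direction; treat its similarity as 0.
  if (normA === 0 || normB === 0) return 0
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}
```

Values near 1 indicate similar meaning, values near 0 indicate unrelated text.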
Available Models
English Models (BGE):
@cf/baai/bge-small-en-v1.5 - 384 dimensions, fastest
@cf/baai/bge-base-en-v1.5 - 768 dimensions, balanced
@cf/baai/bge-large-en-v1.5 - 1024 dimensions, most accurate
Multilingual:
@cf/baai/bge-m3 - 1024 dimensions, 100+ languages, multi-vector retrieval
Specialized:
@cf/google/embeddinggemma-300m - From Gemma 3, 100+ languages
@cf/pfnet/plamo-embedding-1b - Japanese text
@cf/qwen/qwen3-embedding-0.6b - Chinese/multilingual
Generate Embeddings
agents:
  - name: embed-text
    operation: think
    config:
      provider: workers-ai
      model: '@cf/baai/bge-base-en-v1.5'
      prompt: ${input.text}
Store in Vectorize
agents:
  - name: generate-embedding
    operation: think
    config:
      provider: workers-ai
      model: '@cf/baai/bge-base-en-v1.5'
      prompt: ${input.document}
  - name: store-vector
    operation: storage
    config:
      action: vectorize-insert
      index: documents
      vectors:
        - id: ${input.id}
          values: ${generate-embedding.output}
          metadata:
            text: ${input.document}
Semantic Search
agents:
  - name: embed-query
    operation: think
    config:
      provider: workers-ai
      model: '@cf/baai/bge-base-en-v1.5'
      prompt: ${input.query}
  - name: search
    operation: storage
    config:
      action: vectorize-query
      index: documents
      vector: ${embed-query.output}
      topK: 5
Choosing an Embedding Model
Use bge-small-en-v1.5 when:
Speed is critical
Low latency required
English-only content
Cost-sensitive (fewer dimensions = cheaper storage)
Use bge-base-en-v1.5 when:
Balanced performance needed
General-purpose embeddings
Primarily English content
Use bge-large-en-v1.5 when:
Maximum accuracy required
Complex semantic understanding
Willing to trade speed for quality
Use bge-m3 when:
Multilingual content (100+ languages)
Need multi-vector retrieval
Cross-language search
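To make the "fewer dimensions = cheaper storage" trade-off concrete, here is a rough estimate of raw vector size per model. It assumes each dimension is stored as a 4-byte float32 and ignores metadata and index overhead; both are assumptions, and the helper name is hypothetical.

```typescript
// Dimensions per model, taken from the model list above.
const EMBEDDING_DIMENSIONS: Record<string, number> = {
  '@cf/baai/bge-small-en-v1.5': 384,
  '@cf/baai/bge-base-en-v1.5': 768,
  '@cf/baai/bge-large-en-v1.5': 1024,
  '@cf/baai/bge-m3': 1024,
}

// Assumption: each dimension is stored as a 4-byte float32 value.
const BYTES_PER_DIMENSION = 4

// Hypothetical helper: raw bytes needed to store `vectorCount` embeddings.
function estimateVectorBytes(model: string, vectorCount: number): number {
  const dimensions = EMBEDDING_DIMENSIONS[model]
  if (dimensions === undefined) {
    throw new Error(`Unknown embedding model: ${model}`)
  }
  return dimensions * BYTES_PER_DIMENSION * vectorCount
}
```

Under these assumptions, one million vectors take about 1.5 GB with bge-small-en-v1.5 and about 4.1 GB with bge-large-en-v1.5.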
Image Classification
Classify images into categories using ResNet-50.
Model
@cf/microsoft/resnet-50 - 1000 ImageNet classes
Classify Image
agents:
  - name: classify-image
    operation: think
    config:
      provider: workers-ai
      model: '@cf/microsoft/resnet-50'
      image: ${input.image_url}

Output:

{
  "predictions": [
    { "label": "golden retriever", "score": 0.92 },
    { "label": "Labrador retriever", "score": 0.05 },
    { "label": "cocker spaniel", "score": 0.02 }
  ]
}
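A downstream step usually wants a single label rather than the full list. The sketch below picks the highest-scoring prediction and rejects it when the score is too low; the helper name and the default threshold of 0.5 are illustrative assumptions.

```typescript
interface Prediction {
  label: string
  score: number
}

// Hypothetical helper: returns the highest-scoring label, or null when
// no prediction reaches the minimum score.
function topLabel(predictions: Prediction[], minScore = 0.5): string | null {
  let best: Prediction | null = null
  for (const p of predictions) {
    if (best === null || p.score > best.score) best = p
  }
  return best !== null && best.score >= minScore ? best.label : null
}
```

Returning null for low-confidence results lets the caller fall back to manual review instead of trusting a weak guess.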
Use Cases
Content Moderation:
agents:
  - name: classify
    operation: think
    config:
      provider: workers-ai
      model: '@cf/microsoft/resnet-50'
      image: ${input.uploaded_image}
  - name: filter
    operation: code
    config:
      script: scripts/content-moderation-filter
    input:
      predictions: ${classify.output.predictions}

// scripts/content-moderation-filter.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default function contentModerationFilter(context: AgentExecutionContext) {
  const { predictions } = context.input as { predictions: Array<{ label: string }> }
  // predictions may be empty, so guard before reading the top label
  const top = predictions[0]
  if (top?.label.includes('inappropriate')) {
    throw new Error('Content violation')
  }
  return { approved: true }
}
Auto-Tagging:
agents:
  - name: classify
    operation: think
    config:
      provider: workers-ai
      model: '@cf/microsoft/resnet-50'
      image: ${input.product_image}
  - name: generate-tags
    operation: code
    config:
      script: scripts/generate-image-tags
    input:
      predictions: ${classify.output.predictions}

// scripts/generate-image-tags.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default function generateImageTags(context: AgentExecutionContext) {
  const { predictions } = context.input as { predictions: Array<{ label: string }> }
  // Use the five highest-ranked labels as tags
  const tags = predictions.slice(0, 5).map(p => p.label)
  return { tags }
}
Object Detection
Detect objects in images with bounding boxes and class labels.
Model
@cf/facebook/detr-resnet-50 - Detection Transformer
Detect Objects
agents:
  - name: detect-objects
    operation: think
    config:
      provider: workers-ai
      model: '@cf/facebook/detr-resnet-50'
      image: ${input.image_url}

Output:

{
  "objects": [
    {
      "label": "person",
      "score": 0.98,
      "box": { "xmin": 120, "ymin": 50, "xmax": 250, "ymax": 400 }
    },
    {
      "label": "car",
      "score": 0.95,
      "box": { "xmin": 300, "ymin": 200, "xmax": 500, "ymax": 350 }
    }
  ]
}
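The bounding boxes can be used to find the most prominent object in a frame. This sketch computes each box's area and returns the largest confident detection; it assumes the coordinates are pixel values, and the helper names and the 0.8 threshold are illustrative.

```typescript
interface Box { xmin: number; ymin: number; xmax: number; ymax: number }
interface Detection { label: string; score: number; box: Box }

// Area in square pixels; degenerate boxes count as 0.
function boxArea(box: Box): number {
  return Math.max(0, box.xmax - box.xmin) * Math.max(0, box.ymax - box.ymin)
}

// Hypothetical helper: the largest detection at or above `minScore`, or null.
function largestDetection(objects: Detection[], minScore = 0.8): Detection | null {
  let best: Detection | null = null
  for (const o of objects) {
    if (o.score < minScore) continue
    if (best === null || boxArea(o.box) > boxArea(best.box)) best = o
  }
  return best
}
```

For the sample output above, the person box covers 130 x 350 = 45,500 square pixels and the car box covers 200 x 150 = 30,000, so the person is returned.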
Use Cases
Count Objects:
agents:
  - name: detect
    operation: think
    config:
      provider: workers-ai
      model: '@cf/facebook/detr-resnet-50'
      image: ${input.warehouse_photo}
  - name: count-inventory
    operation: code
    config:
      script: scripts/count-inventory
    input:
      objects: ${detect.output.objects}

// scripts/count-inventory.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

interface DetectedObject {
  label: string
  score: number
}

export default function countInventory(context: AgentExecutionContext) {
  const { objects } = context.input as { objects: DetectedObject[] }
  // Count only confident "box" detections
  const boxes = objects.filter(o => o.label === 'box' && o.score > 0.8)
  return { count: boxes.length }
}
Scene Understanding:
agents:
  - name: detect
    operation: think
    config:
      provider: workers-ai
      model: '@cf/facebook/detr-resnet-50'
      image: ${input.scene_image}
  - name: analyze-scene
    operation: code
    config:
      script: scripts/analyze-scene
    input:
      objects: ${detect.output.objects}

// scripts/analyze-scene.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

interface DetectedObject {
  label: string
  score: number
}

export default function analyzeScene(context: AgentExecutionContext) {
  const { objects } = context.input as { objects: DetectedObject[] }
  return {
    people: objects.filter(o => o.label === 'person').length,
    vehicles: objects.filter(o => ['car', 'truck', 'bus'].includes(o.label)).length,
    // Mean detection score, or 0 when nothing was detected
    confidence: objects.length > 0
      ? objects.reduce((sum, o) => sum + o.score, 0) / objects.length
      : 0
  }
}
Image-to-Text
Generate text descriptions or answers from images.
Models
@cf/llava-hf/llava-1.5-7b-hf - Vision Q&A and captioning
@cf/unum/uform-gen2-qwen-500m - Lightweight image-to-text
Generate Caption
agents:
  - name: caption-image
    operation: think
    config:
      provider: workers-ai
      model: '@cf/llava-hf/llava-1.5-7b-hf'
      prompt: "Describe this image in detail"
      image: ${input.image_url}
Image Q&A
agents:
  - name: answer-question
    operation: think
    config:
      provider: workers-ai
      model: '@cf/llava-hf/llava-1.5-7b-hf'
      prompt: ${input.question}
      image: ${input.image_url}

Example:

input:
  question: "How many people are in this photo?"
  image_url: "https://example.com/photo.jpg"
output: "There are 3 people visible in this photograph."
Use Cases
Accessibility:
agents:
  - name: generate-alt-text
    operation: think
    config:
      provider: workers-ai
      model: '@cf/llava-hf/llava-1.5-7b-hf'
      prompt: "Generate descriptive alt text for screen readers"
      image: ${input.image}
Product Descriptions:
agents:
  - name: describe-product
    operation: think
    config:
      provider: workers-ai
      model: '@cf/llava-hf/llava-1.5-7b-hf'
      prompt: "Describe this product's features, color, and style"
      image: ${input.product_photo}
Vision Models (Multimodal LLMs)
Advanced vision understanding using multimodal language models.
Models
@cf/meta/llama-3.2-11b-vision-instruct - Llama with vision
@cf/google/gemma-3-12b-it - Gemma with image support
Visual Reasoning
agents:
  - name: analyze-chart
    operation: think
    config:
      provider: workers-ai
      model: '@cf/meta/llama-3.2-11b-vision-instruct'
      prompt: "Extract all data points from this chart and summarize the trends"
      image: ${input.chart_image}
Document OCR
agents:
  - name: extract-text
    operation: think
    config:
      provider: workers-ai
      model: '@cf/meta/llama-3.2-11b-vision-instruct'
      prompt: "Extract all text from this document, preserving structure"
      image: ${input.document_scan}
Visual Q&A with Context
agents:
  - name: visual-qa
    operation: think
    config:
      provider: workers-ai
      model: '@cf/meta/llama-3.2-11b-vision-instruct'
      prompt: |
        Context: ${input.context}
        Question: ${input.question}
        Analyze the image and answer the question using both the visual information and context.
      image: ${input.image}
Use Cases
Invoice Processing:
agents:
  - name: process-invoice
    operation: think
    config:
      provider: workers-ai
      model: '@cf/meta/llama-3.2-11b-vision-instruct'
      prompt: |
        Extract the following from this invoice:
        - Invoice number
        - Date
        - Total amount
        - Line items with quantities and prices
        Return only a JSON object with the keys invoice_number, date, total, and line_items.
      image: ${input.invoice_image}
  - name: validate
    operation: code
    config:
      script: scripts/validate-invoice-data
    input:
      rawOutput: ${process-invoice.output}

// scripts/validate-invoice-data.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default function validateInvoiceData(context: AgentExecutionContext) {
  const { rawOutput } = context.input as { rawOutput: string }
  // The model returns free text, so parsing can fail
  let data: { invoice_number?: unknown; total?: unknown }
  try {
    data = JSON.parse(rawOutput)
  } catch {
    throw new Error('Model output is not valid JSON')
  }
  if (!data.invoice_number || !data.total) {
    throw new Error('Missing required fields')
  }
  return data
}
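Model output is free text, so JSON.parse can fail when the model adds commentary around the JSON. One possible mitigation, sketched below, is to extract the outermost braces before parsing. The helper name extractJson is hypothetical, and the approach assumes the response contains a single JSON object.

```typescript
// Hypothetical helper: finds the first '{' and the last '}' and parses
// everything in between. Returns null when no valid JSON object is found.
function extractJson(raw: string): Record<string, unknown> | null {
  const start = raw.indexOf('{')
  const end = raw.lastIndexOf('}')
  if (start === -1 || end <= start) return null
  try {
    const parsed: unknown = JSON.parse(raw.slice(start, end + 1))
    if (typeof parsed === 'object' && parsed !== null && !Array.isArray(parsed)) {
      return parsed as Record<string, unknown>
    }
    return null
  } catch {
    // The text between the braces was not valid JSON.
    return null
  }
}
```

A validation script could call this helper first and throw only when it returns null.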
Chart Analysis:
agents:
  - name: analyze-metrics
    operation: think
    config:
      provider: workers-ai
      model: '@cf/google/gemma-3-12b-it'
      prompt: "Analyze this metrics dashboard. What are the key trends and anomalies?"
      image: ${input.dashboard_screenshot}
Text Classification & Reranking
Classify text or rerank search results for better relevance.
Models
Reranking:
@cf/baai/bge-reranker-base - Semantic similarity scoring
Sentiment Analysis:
@cf/huggingface/distilbert-sst-2-int8 - Positive/negative classification
Rerank Search Results
agents:
  # Assumes an earlier agent named query-embedding has embedded the query
  - name: initial-search
    operation: storage
    config:
      action: vectorize-query
      index: documents
      vector: ${query-embedding.output}
      topK: 20
  - name: rerank
    operation: think
    config:
      provider: workers-ai
      model: '@cf/baai/bge-reranker-base'
      query: ${input.query}
      documents: ${initial-search.output.matches}
      topK: 5
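Conceptually, reranking scores each candidate against the query and keeps the best few. The sketch below shows that final step, assuming the reranker returns one { id, score } pair per document where id is the document's index in the input list. That output shape and the helper name are assumptions, not something this page specifies.

```typescript
interface RerankScore {
  id: number    // assumed: index of the document in the input list
  score: number // assumed: higher means more relevant
}

// Hypothetical helper: orders documents by rerank score and keeps the top K.
function applyRerank<T>(documents: T[], scores: RerankScore[], topK: number): T[] {
  return [...scores]
    .sort((a, b) => b.score - a.score)
    .slice(0, topK)
    .map(s => documents[s.id])
    // Drop scores that point outside the document list.
    .filter((d): d is T => d !== undefined)
}
```

Searching broadly (topK: 20) and then keeping only the best five trades a little extra latency for higher precision.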
Sentiment Analysis
agents:
  - name: classify-sentiment
    operation: think
    config:
      provider: workers-ai
      model: '@cf/huggingface/distilbert-sst-2-int8'
      prompt: ${input.review_text}

Output:

{
  "label": "POSITIVE",
  "score": 0.94
}
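A label and score pair is typically turned into a routing decision. The following sketch sends low-confidence results to manual review and escalates confident negatives; the NEGATIVE label, the route names, and the 0.9 threshold are assumptions for illustration.

```typescript
interface SentimentResult {
  label: 'POSITIVE' | 'NEGATIVE' // NEGATIVE is assumed as the counterpart label
  score: number
}

type Route = 'escalate' | 'review' | 'none'

// Hypothetical helper: decides what to do with a classified review.
function routeReview(result: SentimentResult, threshold = 0.9): Route {
  // Low confidence in either direction goes to a human.
  if (result.score < threshold) return 'review'
  return result.label === 'NEGATIVE' ? 'escalate' : 'none'
}
```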
Complete Examples
Semantic Search with Reranking
ensemble: semantic-search
agents:
  # Generate query embedding
  - name: embed-query
    operation: think
    config:
      provider: workers-ai
      model: '@cf/baai/bge-base-en-v1.5'
      prompt: ${input.query}
  # Initial vector search (broad)
  - name: vector-search
    operation: storage
    config:
      action: vectorize-query
      index: knowledge-base
      vector: ${embed-query.output}
      topK: 20
  # Rerank for precision
  - name: rerank
    operation: think
    config:
      provider: workers-ai
      model: '@cf/baai/bge-reranker-base'
      query: ${input.query}
      documents: ${vector-search.output.matches}
      topK: 5
Image Upload Pipeline
ensemble: process-upload
agents:
  # Classify image
  - name: classify
    operation: think
    config:
      provider: workers-ai
      model: '@cf/microsoft/resnet-50'
      image: ${input.image_url}
  # Detect objects
  - name: detect
    operation: think
    config:
      provider: workers-ai
      model: '@cf/facebook/detr-resnet-50'
      image: ${input.image_url}
  # Generate caption
  - name: caption
    operation: think
    config:
      provider: workers-ai
      model: '@cf/llava-hf/llava-1.5-7b-hf'
      prompt: "Generate a descriptive caption for this image"
      image: ${input.image_url}
  # Store metadata
  - name: store
    operation: storage
    config:
      action: d1-insert
      table: images
      data:
        url: ${input.image_url}
        category: ${classify.output.predictions[0].label}
        objects: ${detect.output.objects}
        caption: ${caption.output}
Visual Document Processing
ensemble: process-document
agents:
  # Extract text with OCR
  - name: ocr
    operation: think
    config:
      provider: workers-ai
      model: '@cf/meta/llama-3.2-11b-vision-instruct'
      prompt: "Extract all text from this document, maintaining structure"
      image: ${input.document_image}
  # Generate embedding of content
  - name: embed
    operation: think
    config:
      provider: workers-ai
      model: '@cf/baai/bge-base-en-v1.5'
      prompt: ${ocr.output}
  # Store in vector database
  - name: index
    operation: storage
    config:
      action: vectorize-insert
      index: documents
      vectors:
        - id: ${input.document_id}
          values: ${embed.output}
          metadata:
            text: ${ocr.output}
            image_url: ${input.document_image}
Best Practices
Model Selection
Embeddings:
English-only → bge-base-en-v1.5
Multilingual → bge-m3
Speed critical → bge-small-en-v1.5
Max accuracy → bge-large-en-v1.5
Vision:
Simple classification → resnet-50
Object detection → detr-resnet-50
Image Q&A → llava-1.5-7b-hf
Complex reasoning → llama-3.2-vision or gemma-3
Caching
Workers AI responses can be cached:
agents:
  - name: classify
    operation: think
    config:
      provider: workers-ai
      model: '@cf/microsoft/resnet-50'
      image: ${input.image}
      cache: true
      cacheTTL: 3600
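To illustrate what a time-to-live implies, here is a small in-memory cache with expiry. It is only a sketch of the general technique, not Conductor's implementation, and it assumes cacheTTL is expressed in seconds, which this page does not state.

```typescript
// Illustrative in-memory TTL cache; not Conductor's actual implementation.
class TtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>()
  private readonly ttlMs: number
  private readonly now: () => number

  // `now` is injectable so expiry can be tested without waiting.
  constructor(ttlSeconds: number, now: () => number = () => Date.now()) {
    this.ttlMs = ttlSeconds * 1000
    this.now = now
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (this.now() >= entry.expiresAt) {
      // Expired: evict and report a miss.
      this.entries.delete(key)
      return undefined
    }
    return entry.value
  }

  set(key: string, value: V): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs })
  }
}
```

With a TTL of 3600, a result stored for a given image is reused for one hour and recomputed on the first request after that.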
Error Handling
agents:
  - name: detect
    operation: think
    config:
      provider: workers-ai
      model: '@cf/facebook/detr-resnet-50'
      image: ${input.image}
    retry:
      maxAttempts: 3
      backoff: exponential
  - name: handle-failure
    condition: ${!detect.success}
    operation: code
    config:
      script: scripts/handle-detection-failure

// scripts/handle-detection-failure.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default function handleDetectionFailure(_context: AgentExecutionContext) {
  // Return a fallback marker instead of failing the whole ensemble
  return {
    error: 'Object detection failed',
    fallback: true
  }
}
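Exponential backoff means each retry waits longer than the previous one. The sketch below computes a possible wait schedule, assuming a 1000 ms base delay that doubles on every retry; the base delay and growth factor Conductor actually uses are not stated here, so treat both as assumptions.

```typescript
// Assumption: the first retry waits `baseMs`, and each later retry doubles it.
function backoffDelays(maxAttempts: number, baseMs = 1000): number[] {
  const delays: number[] = []
  // The first attempt runs immediately, so there are maxAttempts - 1 waits.
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(baseMs * 2 ** retry)
  }
  return delays
}
```

Under these assumptions, maxAttempts: 3 gives one immediate attempt followed by retries after 1 second and 2 seconds.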
Batch requests when possible
Use smaller models for simple tasks
Cache embeddings for repeated queries
Parallelize independent operations
Choose appropriate dimensions (smaller = faster + cheaper storage)
Limitations
Free Tier:
10,000 requests/day
Rate limits apply
Image Requirements:
Max size varies by model
Supported formats: JPEG, PNG, WebP
Must be accessible URLs or base64
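Checking image inputs before calling a model avoids spending a request on input that cannot succeed. The sketch below accepts http(s) URLs and base64 data URIs in the three supported formats. It assumes base64 images arrive as data URIs; whether raw base64 strings are also accepted is not covered here, and the helper name is hypothetical.

```typescript
const SUPPORTED_MIME_TYPES = ['image/jpeg', 'image/png', 'image/webp']

type ImageInputKind = 'url' | 'base64' | 'invalid'

// Hypothetical helper: classifies an image input string.
function classifyImageInput(image: string): ImageInputKind {
  // Data URI form, e.g. data:image/png;base64,<payload>
  const dataUri = /^data:([a-z]+\/[a-z0-9.+-]+);base64,[A-Za-z0-9+/]+=*$/i.exec(image)
  if (dataUri) {
    return SUPPORTED_MIME_TYPES.includes(dataUri[1].toLowerCase()) ? 'base64' : 'invalid'
  }
  try {
    const url = new URL(image)
    return url.protocol === 'http:' || url.protocol === 'https:' ? 'url' : 'invalid'
  } catch {
    // Not parseable as a URL at all.
    return 'invalid'
  }
}
```

This only validates the shape of the input; it does not check that a URL is reachable or that the file is within a model's size limit.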
Model Availability:
Next Steps
think Operation: Full think operation reference
storage Operation: Store embeddings in Vectorize
RAG Pipeline: Complete RAG example
Workers AI Docs: Cloudflare Workers AI documentation