Run Workers AI models from the think operation using the workers-ai provider.
Overview
Workers AI provides serverless GPU inference for ML models:
- Free tier: 10,000 requests/day
- Latency: Runs at the edge, near your users
- Provider: Use workers-ai in the think operation
- Binding: Requires an [ai] binding in wrangler.toml
Supported model types:
- Text Embeddings (7 models)
- Image Classification (1 model)
- Object Detection (1 model)
- Image-to-Text (2 models)
- Vision Models (2 multimodal LLMs)
- Text Classification (2 models)
Configuration
wrangler.toml
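A minimal fragment for the binding might look like the following. The binding name AI is an assumption (any name works); whatever name is chosen is the property the Worker reads from its env object.

```toml
# Expose Workers AI to the Worker as env.AI (the binding name is an assumption)
[ai]
binding = "AI"
```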
Environment Variable
Set CONDUCTOR_AI_PROVIDER=workers-ai, or configure the provider per agent.
Text Embeddings
Convert text into vector representations for semantic search, RAG, clustering, and similarity tasks.
Available Models
English Models (BGE):
- @cf/baai/bge-small-en-v1.5 - 384 dimensions, fastest
- @cf/baai/bge-base-en-v1.5 - 768 dimensions, balanced
- @cf/baai/bge-large-en-v1.5 - 1024 dimensions, most accurate
Multilingual:
- @cf/baai/bge-m3 - 1024 dimensions, 100+ languages, multi-vector retrieval
Other Models:
- @cf/google/embeddinggemma-300m - From Gemma 3, 100+ languages
- @cf/pfnet/plamo-embedding-1b - Japanese text
- @cf/qwen/qwen3-embedding-0.6b - Chinese/multilingual
Generate Embeddings
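As a rough sketch, embeddings can be generated by calling the Workers AI binding directly from a Worker. The `EmbeddingAi` interface below is a hand-written structural type (an assumption, not the official type), and `embedTexts` and `cosineSimilarity` are hypothetical helper names. The input and output shapes follow the BGE models; other embedding models may differ.

```typescript
// Minimal structural type for the Workers AI binding (an assumption, not the full type).
interface EmbeddingAi {
  run(model: string, inputs: { text: string[] }): Promise<{ shape: number[]; data: number[][] }>;
}

// Embed a batch of texts in one call; returns one vector per input text.
async function embedTexts(
  ai: EmbeddingAi,
  texts: string[],
  model: string = "@cf/baai/bge-base-en-v1.5",
): Promise<number[][]> {
  if (texts.length === 0) return [];
  const result = await ai.run(model, { text: texts });
  if (result.data.length !== texts.length) {
    throw new Error(`expected ${texts.length} vectors, got ${result.data.length}`);
  }
  return result.data;
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  });
  return normA === 0 || normB === 0 ? 0 : dot / Math.sqrt(normA * normB);
}
```

Inside a Worker this would be called as `embedTexts(env.AI, ["first text", "second text"])`, assuming the binding is named AI.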
Store in Vectorize
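One way to store embeddings is to pair each document with its vector and upsert the records into a Vectorize index. The sketch below assumes a Vectorize binding passed in as `index`; `VectorIndex`, `SourceDoc`, `toVectorRecords`, and `storeEmbeddings` are hypothetical names, and keeping the source text in metadata is a design choice rather than a requirement.

```typescript
interface VectorRecord {
  id: string;
  values: number[];
  metadata?: Record<string, string>;
}

// Structural slice of a Vectorize binding (an assumption, not the full type).
interface VectorIndex {
  upsert(vectors: VectorRecord[]): Promise<unknown>;
}

interface SourceDoc {
  id: string;
  text: string;
}

// Pair each document with its embedding; the two arrays must be aligned by position.
function toVectorRecords(docs: SourceDoc[], embeddings: number[][]): VectorRecord[] {
  if (docs.length !== embeddings.length) {
    throw new Error("docs and embeddings must have the same length");
  }
  return docs.map((doc, i) => ({
    id: doc.id,
    values: embeddings[i] ?? [],
    metadata: { text: doc.text },
  }));
}

// Upsert the records and report how many were written.
async function storeEmbeddings(index: VectorIndex, docs: SourceDoc[], embeddings: number[][]): Promise<number> {
  const records = toVectorRecords(docs, embeddings);
  await index.upsert(records);
  return records.length;
}
```

The index dimension has to match the embedding model, for example 768 for bge-base-en-v1.5.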
Semantic Search
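Semantic search then amounts to embedding the query with the same model used at indexing time and asking the index for its nearest neighbours. This is a sketch under assumptions: `QueryAi` and `SearchIndex` are hand-written structural types, `semanticSearch` and `keepRelevant` are hypothetical helpers, and the 0.5 score threshold is an arbitrary illustration.

```typescript
interface SearchMatch {
  id: string;
  score: number;
  metadata?: Record<string, unknown>;
}

// Structural slices of the Vectorize and Workers AI bindings (assumptions).
interface SearchIndex {
  query(vector: number[], options: { topK: number; returnMetadata: "all" }): Promise<{ matches: SearchMatch[] }>;
}
interface QueryAi {
  run(model: string, inputs: { text: string[] }): Promise<{ data: number[][] }>;
}

// Drop weak matches and order the rest from most to least similar.
function keepRelevant(matches: SearchMatch[], minScore: number): SearchMatch[] {
  return matches.filter((m) => m.score >= minScore).sort((a, b) => b.score - a.score);
}

async function semanticSearch(
  ai: QueryAi,
  index: SearchIndex,
  query: string,
  topK: number = 5,
  minScore: number = 0.5,
): Promise<SearchMatch[]> {
  // The query must be embedded with the same model that produced the stored vectors.
  const embedded = await ai.run("@cf/baai/bge-base-en-v1.5", { text: [query] });
  const vector = embedded.data[0];
  if (vector === undefined) throw new Error("no embedding returned for query");
  const result = await index.query(vector, { topK, returnMetadata: "all" });
  return keepRelevant(result.matches, minScore);
}
```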
Choosing an Embedding Model
Use bge-small-en-v1.5 when:
- Speed is critical
- Low latency required
- English-only content
- Cost-sensitive (fewer dimensions = cheaper storage)
Use bge-base-en-v1.5 when:
- Balanced performance needed
- General-purpose embeddings
- Mostly English content with some multilingual text
Use bge-large-en-v1.5 when:
- Maximum accuracy required
- Complex semantic understanding
- Willing to trade speed for quality
Use bge-m3 when:
- Multilingual content (100+ languages)
- Need multi-vector retrieval
- Cross-language search
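The guidance above can be encoded as a small lookup so the choice lives in one place. `pickEmbeddingModel` and the `Priority` type are hypothetical; the mapping simply restates the criteria in this section and treats multilingual content as the overriding factor.

```typescript
type Priority = "speed" | "balanced" | "accuracy";

// Map content and priority requirements to one of the embedding models listed above.
function pickEmbeddingModel(multilingual: boolean, priority: Priority): string {
  // Multilingual or cross-language content takes precedence over speed/accuracy trade-offs.
  if (multilingual) return "@cf/baai/bge-m3";
  switch (priority) {
    case "speed":
      return "@cf/baai/bge-small-en-v1.5"; // 384 dimensions
    case "accuracy":
      return "@cf/baai/bge-large-en-v1.5"; // 1024 dimensions
    default:
      return "@cf/baai/bge-base-en-v1.5"; // 768 dimensions
  }
}
```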
Image Classification
Classify images into categories using ResNet-50.
Model
- @cf/microsoft/resnet-50 - 1000 ImageNet classes
Classify Image
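A minimal sketch of classification through the binding, assuming the image is already available as raw bytes. `ClassifierAi` is a hand-written structural type, and `classifyImage` and `topLabels` are hypothetical helpers; the 0.1 threshold and the limit of three labels are illustrative defaults.

```typescript
interface Classification {
  label: string;
  score: number;
}

// Structural slice of the Workers AI binding for image classification (an assumption).
interface ClassifierAi {
  run(model: string, inputs: { image: number[] }): Promise<Classification[]>;
}

// Keep the most confident labels above a threshold, best first.
function topLabels(predictions: Classification[], minScore: number, limit: number = 3): Classification[] {
  return predictions
    .filter((p) => p.score >= minScore)
    .sort((a, b) => b.score - a.score)
    .slice(0, limit);
}

async function classifyImage(ai: ClassifierAi, bytes: ArrayBuffer, minScore: number = 0.1): Promise<Classification[]> {
  // The model takes the image as an array of byte values.
  const image = Array.from(new Uint8Array(bytes));
  const predictions = await ai.run("@cf/microsoft/resnet-50", { image });
  return topLabels(predictions, minScore);
}
```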
Use Cases
Content Moderation:
Object Detection
Detect objects in images with bounding boxes and class labels.
Model
- @cf/facebook/detr-resnet-50 - Detection Transformer
Detect Objects
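The following sketch runs detection and then tallies detections per label, which also covers simple object counting. `DetectorAi` is a hand-written structural type, `detectObjects` and `countByLabel` are hypothetical names, and the 0.5 confidence threshold is an assumption to tune per use case.

```typescript
interface Detection {
  label: string;
  score: number;
  box: { xmin: number; ymin: number; xmax: number; ymax: number };
}

// Structural slice of the Workers AI binding for object detection (an assumption).
interface DetectorAi {
  run(model: string, inputs: { image: number[] }): Promise<Detection[]>;
}

// Count confident detections per class label.
function countByLabel(detections: Detection[], minScore: number): Record<string, number> {
  const counts: Record<string, number> = {};
  detections
    .filter((d) => d.score >= minScore)
    .forEach((d) => {
      counts[d.label] = (counts[d.label] ?? 0) + 1;
    });
  return counts;
}

async function detectObjects(ai: DetectorAi, bytes: ArrayBuffer, minScore: number = 0.5): Promise<Detection[]> {
  const image = Array.from(new Uint8Array(bytes));
  const detections = await ai.run("@cf/facebook/detr-resnet-50", { image });
  // Low-confidence boxes are usually noise, so drop them before returning.
  return detections.filter((d) => d.score >= minScore);
}
```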
Use Cases
Count Objects:
Image-to-Text
Generate text descriptions or answers from images.
Models
- @cf/llava-hf/llava-1.5-7b-hf - Vision Q&A and captioning
- @cf/unum/uform-gen2-qwen-500m - Lightweight image-to-text
Generate Caption
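Captioning can be sketched as a single call with a fixed prompt. `CaptionAi` is a hand-written structural type, `captionImage` and `tidyCaption` are hypothetical helpers, and the prompt wording and 64-token limit are assumptions.

```typescript
// Structural slice of the Workers AI binding for image-to-text (an assumption).
interface CaptionAi {
  run(
    model: string,
    inputs: { image: number[]; prompt: string; max_tokens: number },
  ): Promise<{ description: string }>;
}

// Collapse runs of whitespace and trim, so captions are safe to use as alt text.
function tidyCaption(raw: string): string {
  return raw.replace(/\s+/g, " ").trim();
}

async function captionImage(ai: CaptionAi, bytes: ArrayBuffer, maxTokens: number = 64): Promise<string> {
  const result = await ai.run("@cf/llava-hf/llava-1.5-7b-hf", {
    image: Array.from(new Uint8Array(bytes)),
    prompt: "Describe this image in one sentence.",
    max_tokens: maxTokens,
  });
  return tidyCaption(result.description);
}
```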
Image Q&A
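Question answering uses the same model with the question placed in the prompt. In this sketch `buildVqaInput` and `askAboutImage` are hypothetical helpers and the prompt template is an assumption; separating input construction from the call keeps the prompt easy to inspect and test.

```typescript
interface VqaInput {
  image: number[];
  prompt: string;
  max_tokens: number;
}

// Structural slice of the Workers AI binding for image Q&A (an assumption).
interface VqaAi {
  run(model: string, inputs: VqaInput): Promise<{ description: string }>;
}

// Build the model input, rejecting empty questions before any inference is paid for.
function buildVqaInput(bytes: ArrayBuffer, question: string, maxTokens: number = 128): VqaInput {
  const trimmed = question.trim();
  if (trimmed.length === 0) throw new Error("question must not be empty");
  return {
    image: Array.from(new Uint8Array(bytes)),
    prompt: `Answer based only on the image. Question: ${trimmed}`,
    max_tokens: maxTokens,
  };
}

async function askAboutImage(ai: VqaAi, bytes: ArrayBuffer, question: string): Promise<string> {
  const result = await ai.run("@cf/llava-hf/llava-1.5-7b-hf", buildVqaInput(bytes, question));
  return result.description.trim();
}
```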
Use Cases
Accessibility:
Vision Models (Multimodal LLMs)
Advanced vision understanding using multimodal language models.
Models
- @cf/meta/llama-3.2-11b-vision-instruct - Llama with vision
- @cf/google/gemma-3-12b-it - Gemma with image support
Visual Reasoning
Document OCR
Visual Q&A with Context
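Visual reasoning, document OCR, and Q&A with context differ mainly in the prompt sent alongside the image, so one sketch can cover all three. Everything here is an assumption to verify against the model's own page: the input shape (chat messages plus an image byte array), the `response` output field, the prompt wording, and any license step the model requires before first use. `VisionAi`, `buildVisionMessages`, and `runVisionTask` are hypothetical names.

```typescript
type VisionTask = "reasoning" | "ocr" | "qa";

interface ChatMessage {
  role: "system" | "user";
  content: string;
}

// Assumed structural slice of the Workers AI binding for multimodal chat models.
interface VisionAi {
  run(
    model: string,
    inputs: { messages: ChatMessage[]; image: number[]; max_tokens: number },
  ): Promise<{ response?: string }>;
}

// Build the system/user message pair for each task.
function buildVisionMessages(task: VisionTask, question: string = "", context: string = ""): ChatMessage[] {
  switch (task) {
    case "ocr":
      return [
        { role: "system", content: "Transcribe all text in the image exactly. Do not summarize." },
        { role: "user", content: "Extract the text from this document." },
      ];
    case "qa":
      return [
        { role: "system", content: `Use this context when answering: ${context}` },
        { role: "user", content: question },
      ];
    default:
      return [
        { role: "system", content: "Reason step by step about what the image shows." },
        { role: "user", content: question },
      ];
  }
}

async function runVisionTask(
  ai: VisionAi,
  bytes: ArrayBuffer,
  task: VisionTask,
  question: string = "",
  context: string = "",
): Promise<string> {
  const result = await ai.run("@cf/meta/llama-3.2-11b-vision-instruct", {
    messages: buildVisionMessages(task, question, context),
    image: Array.from(new Uint8Array(bytes)),
    max_tokens: 512,
  });
  return result.response ?? "";
}
```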
Use Cases
Invoice Processing:
Text Classification & Reranking
Classify text or rerank search results for better relevance.
Models
Reranking:
- @cf/baai/bge-reranker-base - Semantic similarity scoring
Sentiment Analysis:
- @cf/huggingface/distilbert-sst-2-int8 - Positive/negative classification
Rerank Search Results
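A reranking pass can be sketched as scoring each candidate text against the query and reordering by that score. The input shape (`query`, `contexts`, `top_k`) and the `response` array of `{ id, score }` entries are written from memory of the reranker's schema and should be treated as assumptions; `rerank` and `orderByScore` are hypothetical helpers.

```typescript
interface RerankScore {
  id: number; // position of the text in the submitted contexts array
  score: number;
}

// Assumed structural slice of the Workers AI binding for the reranker.
interface RerankAi {
  run(
    model: string,
    inputs: { query: string; contexts: { text: string }[]; top_k?: number },
  ): Promise<{ response: RerankScore[] }>;
}

// Map scores back to their texts, best first, ignoring out-of-range ids.
function orderByScore(docs: string[], scores: RerankScore[]): { text: string; score: number }[] {
  return scores
    .filter((s) => s.id >= 0 && s.id < docs.length)
    .sort((a, b) => b.score - a.score)
    .map((s) => ({ text: docs[s.id] ?? "", score: s.score }));
}

async function rerank(
  ai: RerankAi,
  query: string,
  docs: string[],
  topK: number = 3,
): Promise<{ text: string; score: number }[]> {
  if (docs.length === 0) return [];
  const result = await ai.run("@cf/baai/bge-reranker-base", {
    query,
    contexts: docs.map((text) => ({ text })),
    top_k: topK,
  });
  return orderByScore(docs, result.response).slice(0, topK);
}
```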
Sentiment Analysis
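For sentiment, the classifier returns a score per label and the caller picks the strongest one. This is a sketch under assumptions: `SentimentAi` is a hand-written structural type, and `analyzeSentiment` and `strongestLabel` are hypothetical helpers.

```typescript
interface LabelScore {
  label: string; // e.g. "POSITIVE" or "NEGATIVE"
  score: number;
}

// Structural slice of the Workers AI binding for text classification (an assumption).
interface SentimentAi {
  run(model: string, inputs: { text: string }): Promise<LabelScore[]>;
}

// Return the label with the highest score.
function strongestLabel(scores: LabelScore[]): LabelScore {
  if (scores.length === 0) throw new Error("no scores returned");
  return scores.reduce((best, s) => (s.score > best.score ? s : best));
}

async function analyzeSentiment(ai: SentimentAi, text: string): Promise<LabelScore> {
  const scores = await ai.run("@cf/huggingface/distilbert-sst-2-int8", { text });
  return strongestLabel(scores);
}
```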
Complete Examples
Semantic Search with Reranking
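The usual pattern here is two stages: over-fetch candidates from the vector index, then let the reranker pick the final few. The sketch below keeps both stages as injected functions so it stays independent of any particular binding; `Retrieve`, `Rerank`, `searchWithRerank`, and the over-fetch factor of 4 are all assumptions for illustration.

```typescript
interface Candidate {
  id: string;
  text: string;
  score: number;
}

// Stage 1: vector search returning up to `limit` candidates.
type Retrieve = (query: string, limit: number) => Promise<Candidate[]>;
// Stage 2: one relevance score per text, in the same order as the input.
type Rerank = (query: string, texts: string[]) => Promise<number[]>;

async function searchWithRerank(
  retrieve: Retrieve,
  rerank: Rerank,
  query: string,
  topK: number = 5,
  overfetch: number = 4,
): Promise<Candidate[]> {
  // Fetch more candidates than needed so the reranker has room to reorder.
  const candidates = await retrieve(query, topK * overfetch);
  if (candidates.length === 0) return [];
  const scores = await rerank(query, candidates.map((c) => c.text));
  return candidates
    .map((c, i) => ({ ...c, score: scores[i] ?? 0 })) // replace vector score with rerank score
    .sort((a, b) => b.score - a.score)
    .slice(0, topK);
}
```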
Image Upload Pipeline
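An upload pipeline might validate the file type and then run classification, detection, and captioning in parallel, since the three calls are independent. This is a sketch: `PipelineAi`, `isSupportedImage`, and `processUpload` are hypothetical names, results are left untyped, and the supported-format list restates the Limitations section.

```typescript
const SUPPORTED_TYPES = ["image/jpeg", "image/png", "image/webp"];

// Loosely typed slice of the Workers AI binding (an assumption).
interface PipelineAi {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

// Accept only the documented formats; ignores parameters such as "; charset=...".
function isSupportedImage(contentType: string): boolean {
  const base = (contentType.split(";")[0] ?? "").trim().toLowerCase();
  return SUPPORTED_TYPES.indexOf(base) !== -1;
}

async function processUpload(
  ai: PipelineAi,
  bytes: ArrayBuffer,
  contentType: string,
): Promise<{ labels: unknown; objects: unknown; caption: unknown }> {
  if (!isSupportedImage(contentType)) throw new Error(`unsupported image type: ${contentType}`);
  const image = Array.from(new Uint8Array(bytes));
  // The three inferences do not depend on each other, so run them concurrently.
  const [labels, objects, caption] = await Promise.all([
    ai.run("@cf/microsoft/resnet-50", { image }),
    ai.run("@cf/facebook/detr-resnet-50", { image }),
    ai.run("@cf/llava-hf/llava-1.5-7b-hf", { image, prompt: "Describe this image.", max_tokens: 128 }),
  ]);
  return { labels, objects, caption };
}
```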
Visual Document Processing
Best Practices
Model Selection
Embeddings:
- English-only → bge-base-en-v1.5
- Multilingual → bge-m3
- Speed critical → bge-small-en-v1.5
- Max accuracy → bge-large-en-v1.5
Vision:
- Simple classification → resnet-50
- Object detection → detr-resnet-50
- Image Q&A → llava-1.5-7b-hf
- Complex reasoning → llama-3.2-vision or gemma-3
Caching
Workers AI responses can be cached.
Error Handling
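For error handling, one possible approach is to try a preferred model first and fall back to an alternative when the call throws, for example on rate limiting or a model outage. `FallbackAi`, `runWithFallback`, and `describeError` are hypothetical names, and the sketch assumes the fallback models accept the same inputs.

```typescript
// Loosely typed slice of the Workers AI binding (an assumption).
interface FallbackAi {
  run(model: string, inputs: Record<string, unknown>): Promise<unknown>;
}

interface FallbackResult {
  model: string; // the model that actually produced the output
  output: unknown;
  errors: string[]; // failures from models tried before it
}

// Turn any thrown value into a readable message.
function describeError(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}

// Try each model in order and return the first successful result.
async function runWithFallback(
  ai: FallbackAi,
  models: string[],
  inputs: Record<string, unknown>,
): Promise<FallbackResult> {
  const errors: string[] = [];
  for (const model of models) {
    try {
      const output = await ai.run(model, inputs);
      return { model, output, errors };
    } catch (err) {
      errors.push(`${model}: ${describeError(err)}`);
    }
  }
  throw new Error(`all models failed: ${errors.join("; ")}`);
}
```

A caller could pass `["@cf/baai/bge-large-en-v1.5", "@cf/baai/bge-base-en-v1.5"]`, bearing in mind that vectors from different models are not interchangeable within one index.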
Performance Tips
- Batch requests when possible
- Use smaller models for simple tasks
- Cache embeddings for repeated queries
- Parallelize independent operations
- Choose appropriate dimensions (smaller = faster + cheaper storage)
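Batching and caching from the list above can be combined in a small wrapper: repeated texts are served from memory and all cache misses go out in a single batched call. `createCachedEmbedder` is a hypothetical helper, and the in-memory Map is an assumption that only survives as long as the Worker isolate; a persistent store would be needed for a longer-lived cache.

```typescript
type EmbedBatch = (texts: string[]) => Promise<number[][]>;

// Wrap a batch embedding function with an in-memory cache keyed by the exact text.
function createCachedEmbedder(embedBatch: EmbedBatch): EmbedBatch {
  const cache = new Map<string, number[]>();
  return async (texts) => {
    // Unique texts that are not cached yet.
    const misses = texts.filter((t, i) => !cache.has(t) && texts.indexOf(t) === i);
    if (misses.length > 0) {
      // One batched request for every miss instead of one request per text.
      const vectors = await embedBatch(misses);
      if (vectors.length !== misses.length) throw new Error("embedding count mismatch");
      misses.forEach((t, i) => cache.set(t, vectors[i] ?? []));
    }
    return texts.map((t) => cache.get(t) ?? []);
  };
}
```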
Limitations
Free Tier:
- 10,000 requests/day
- Rate limits apply
Image Inputs:
- Max size varies by model
- Supported formats: JPEG, PNG, WebP
- Must be accessible URLs or base64
Model Availability:
- Some models may be in beta
- Check the Cloudflare Workers AI docs for the latest list
Next Steps
- think Operation: Full think operation reference
- storage Operation: Store embeddings in Vectorize
- RAG Pipeline: Complete RAG example
- Workers AI Docs: Cloudflare Workers AI documentation

