Schema Components - Ensemble Edge

Define reusable JSON Schemas for structured AI outputs with Claude’s native JSON Schema support.

Overview

Schema components enable you to:

Get structured outputs from AI models using JSON Schema
Version schemas with edgit for consistency
Share schemas across multiple agents and ensembles
Validate outputs against well-defined schemas

Quick Start

1. Create a Schema Component

Create a JSON Schema file:

// schemas/contact.json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Contact",
  "description": "Contact information schema",
  "type": "object",
  "properties": {
    "first_name": {
      "type": "string",
      "description": "Contact's first name"
    },
    "last_name": {
      "type": "string",
      "description": "Contact's last name"
    },
    "email": {
      "type": "string",
      "format": "email",
      "description": "Email address"
    },
    "phone": {
      "type": "string",
      "description": "Phone number"
    },
    "company": {
      "type": "string",
      "description": "Company name"
    }
  },
  "required": ["first_name", "last_name"],
  "anyOf": [
    {"required": ["email"]},
    {"required": ["phone"]}
  ]
}
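
The `required` and `anyOf` keywords combine here: a contact needs both name fields plus at least one of `email` or `phone`. The sketch below mirrors only that rule in plain TypeScript to make the logic explicit. `satisfiesContactRules` is a hypothetical helper, not part of Conductor, and it does not check types or the `email` format.

```typescript
// Hypothetical helper: mirrors only the `required` and `anyOf` rules of
// schemas/contact.json. It does not check value types or the email format.
type ContactCandidate = Record<string, unknown>;

function hasField(obj: ContactCandidate, key: string): boolean {
  // JSON Schema `required` only asks that the property is present.
  return Object.prototype.hasOwnProperty.call(obj, key);
}

function satisfiesContactRules(contact: ContactCandidate): boolean {
  const required = ["first_name", "last_name"];
  const anyOf = [["email"], ["phone"]];

  // Every top-level required field must be present.
  const hasRequired = required.every((key) => hasField(contact, key));
  // At least one anyOf branch must have all of its required fields.
  const hasOneBranch = anyOf.some((branch) =>
    branch.every((key) => hasField(contact, key))
  );
  return hasRequired && hasOneBranch;
}
```

A contact with both names and a phone number passes; one with both names but neither contact method does not.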

2. Reference the Schema

Use the schema in your ensemble:

name: contact-extractor

flow:
  - agent: extract

agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1
      prompt: |
        Extract contact information from: ${input.text}
      # Reference the schema component
      schema: "schema://[email protected]"

inputs:
  text:
    type: string
    required: true

outputs:
  contact: ${extract.output}

3. Get Structured Output

curl -X POST https://your-worker.workers.dev/api/v1/execute/ensemble/contact-extractor \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "text": "John Smith, [email protected], works at Acme Corp"
    }
  }'

Response:

{
  "success": true,
  "output": {
    "first_name": "John",
    "last_name": "Smith",
    "email": "[email protected]",
    "company": "Acme Corp"
  }
}
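
For callers written in TypeScript, the same request can be assembled in code. This is a minimal sketch: `buildExecuteRequest`, `isExecuteResponse`, and the `ExecuteResponse` shape are hypothetical names inferred from the curl call and sample response above, not a published client API.

```typescript
// Assumed response shape, inferred from the sample response above.
interface ExecuteResponse<T> {
  success: boolean;
  output: T;
}

interface ExecuteRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Hypothetical helper: builds the same request as the curl example.
function buildExecuteRequest(
  baseUrl: string,
  ensemble: string,
  token: string,
  input: Record<string, unknown>
): ExecuteRequest {
  return {
    url: `${baseUrl}/api/v1/execute/ensemble/${encodeURIComponent(ensemble)}`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ input }),
    },
  };
}

// Narrowing check for the parsed response body before reading `output`.
function isExecuteResponse(value: unknown): value is ExecuteResponse<unknown> {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;
  return typeof v.success === "boolean" && "output" in v;
}
```

The returned `url` and `init` can be passed to `fetch(url, init)`, and the parsed JSON body checked with `isExecuteResponse` before use.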

How to Reference in Ensembles

There are three ways to reference schemas in your ensembles:

1. URI Format (Recommended)

Use the schema:// URI format to reference versioned schema components:

ensemble: invoice-extractor

agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1
      prompt: |
        Extract invoice data from: ${input.text}
      schema: "schema://[email protected]"

inputs:
  text:
    type: string

outputs:
  invoice_data: ${extract.output}

2. Template Expression Format

Use ${components.schema_name@version} to embed schema references:

ensemble: multi-field-extractor

agents:
  - name: extract-structured
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      prompt: "Extract data: ${input.text}"
      schema:
        type: object
        properties:
          # Embed schema definitions from components
          contact: ${components.contact-schema@v1}
          address: ${components.address-schema@v1}

inputs:
  text:
    type: string

outputs:
  data: ${extract-structured.output}

3. Inline Schema

For simple operations or during development, define the schema directly in YAML:

ensemble: sentiment-classifier

agents:
  - name: classify
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1
      prompt: "Classify the sentiment: ${input.text}"
      schema:
        type: object
        properties:
          category:
            type: string
            enum: [positive, negative, neutral]
          confidence:
            type: number
            minimum: 0
            maximum: 1
        required: [category, confidence]

inputs:
  text:
    type: string

outputs:
  sentiment: ${classify.output}
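
On the consuming side, the inline schema above maps naturally onto a TypeScript type. The following is a sketch of a type guard that applies the same `enum`, `minimum`, and `maximum` rules; the names `SentimentResult` and `isSentimentResult` are illustrative and not provided by Conductor.

```typescript
// Mirrors the `enum` in the inline schema above.
const CATEGORIES = ["positive", "negative", "neutral"] as const;
type Category = (typeof CATEGORIES)[number];

interface SentimentResult {
  category: Category;
  confidence: number;
}

// Hypothetical type guard: applies the schema's enum and range constraints.
function isSentimentResult(value: unknown): value is SentimentResult {
  if (typeof value !== "object" || value === null) return false;
  const v = value as Record<string, unknown>;

  const categoryOk =
    typeof v.category === "string" &&
    (CATEGORIES as readonly string[]).indexOf(v.category) !== -1;
  const confidenceOk =
    typeof v.confidence === "number" &&
    v.confidence >= 0 &&
    v.confidence <= 1;

  return categoryOk && confidenceOk;
}
```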

Example Schemas

Invoice Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Invoice",
  "description": "Invoice data extraction schema",
  "type": "object",
  "properties": {
    "invoice_number": {
      "type": "string",
      "description": "Invoice number or ID"
    },
    "invoice_date": {
      "type": "string",
      "format": "date",
      "description": "Invoice date (YYYY-MM-DD)"
    },
    "vendor": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "address": {"type": "string"},
        "email": {"type": "string", "format": "email"}
      },
      "required": ["name"]
    },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": {"type": "string"},
          "quantity": {"type": "number"},
          "unit_price": {"type": "number"},
          "total": {"type": "number"}
        },
        "required": ["description", "quantity", "unit_price", "total"]
      }
    },
    "total": {
      "type": "number",
      "description": "Total amount due"
    }
  },
  "required": ["invoice_number", "invoice_date", "vendor", "line_items", "total"]
}
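
JSON Schema can require that each line item carries `quantity`, `unit_price`, and `total`, but it cannot express that the three agree with each other. A follow-up code step could check that arithmetic. The sketch below assumes each line total is simply `quantity * unit_price` (no per-line tax or discount); `findLineItemMismatches` and the tolerance value are illustrative choices.

```typescript
// Types mirroring the line item definition in the Invoice schema above.
interface LineItem {
  description: string;
  quantity: number;
  unit_price: number;
  total: number;
}

// Hypothetical check: reports line items whose total does not match
// quantity * unit_price within a small tolerance for float rounding.
function findLineItemMismatches(
  lineItems: LineItem[],
  tolerance = 0.01
): string[] {
  const problems: string[] = [];
  lineItems.forEach((item, index) => {
    const expected = item.quantity * item.unit_price;
    if (Math.abs(expected - item.total) > tolerance) {
      problems.push(
        `line_items[${index}]: expected ${expected.toFixed(2)}, got ${item.total.toFixed(2)}`
      );
    }
  });
  return problems;
}
```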

Sentiment Analysis Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "SentimentAnalysis",
  "type": "object",
  "properties": {
    "sentiment": {
      "type": "string",
      "enum": ["positive", "negative", "neutral", "mixed"],
      "description": "Overall sentiment"
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "Confidence score (0-1)"
    },
    "emotions": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "emotion": {
            "type": "string",
            "enum": ["joy", "sadness", "anger", "fear", "surprise", "disgust"]
          },
          "intensity": {
            "type": "number",
            "minimum": 0,
            "maximum": 1
          }
        },
        "required": ["emotion", "intensity"]
      }
    },
    "key_phrases": {
      "type": "array",
      "items": {"type": "string"}
    },
    "reasoning": {
      "type": "string",
      "description": "Explanation of the sentiment analysis"
    }
  },
  "required": ["sentiment", "confidence", "reasoning"]
}

Schema Versioning

Schema components follow semantic versioning:

# Version a schema
edgit tag create invoice v1.0.0

# Update schema
edgit tag create invoice v1.1.0

# Tag as production
edgit tag set invoice production v1.1.0

In YAML:

schema: schemas/[email protected]     # Specific version (immutable)
schema: schemas/invoice@v1         # Major version (gets latest v1.x.x)
schema: schemas/invoice@production # Tagged version
schema: schemas/invoice@latest     # Always latest
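
To make the four reference styles concrete, here is a sketch of how such references could resolve against a set of tags. It is not edgit's implementation: `resolveRef` and `SchemaTags` are hypothetical, tags are modeled as plain data, and `latest` is assumed to mean the highest semantic version.

```typescript
// Hypothetical model of the tags edgit tracks for one schema.
interface SchemaTags {
  versions: string[];              // e.g. ["v1.0.0", "v1.1.0"]
  aliases: Record<string, string>; // e.g. { production: "v1.1.0" }
}

type SemVer = [number, number, number];

function parseVersion(tag: string): SemVer | null {
  const m = /^v(\d+)\.(\d+)\.(\d+)$/.exec(tag);
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null;
}

// Returns the highest semantic version in the list, if any.
function highest(versions: string[]): string | undefined {
  let best: { tag: string; parts: SemVer } | undefined;
  for (const tag of versions) {
    const parts = parseVersion(tag);
    if (!parts) continue;
    const newer =
      !best ||
      (parts[0] - best.parts[0] ||
        parts[1] - best.parts[1] ||
        parts[2] - best.parts[2]) > 0;
    if (newer) best = { tag, parts };
  }
  return best?.tag;
}

function resolveRef(ref: string, tags: SchemaTags): string | undefined {
  // Exact version: returned as-is when it exists.
  if (parseVersion(ref)) {
    return tags.versions.indexOf(ref) !== -1 ? ref : undefined;
  }
  // "latest": assumed to be the highest semantic version.
  if (ref === "latest") return highest(tags.versions);
  // Major-only reference such as "v1": highest v1.x.x.
  const major = /^v(\d+)$/.exec(ref);
  if (major) {
    const wanted = Number(major[1]);
    return highest(
      tags.versions.filter((v) => parseVersion(v)?.[0] === wanted)
    );
  }
  // Anything else is treated as a named tag such as "production".
  return Object.prototype.hasOwnProperty.call(tags.aliases, ref)
    ? tags.aliases[ref]
    : undefined;
}
```

Under this model, a reference pinned to an exact version never changes, while `@v1`, `@production`, and `@latest` can each move as new tags are created.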

Provider Support

Anthropic (Claude)

Claude natively supports JSON Schema for structured outputs:

agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4  # Full JSON Schema support
      schema: schemas/invoice@v1

Models with schema support:

claude-opus-4 - Best for complex schemas
claude-sonnet-4 - Balanced performance
claude-haiku-4 - Fast for simple schemas

OpenAI

OpenAI supports structured outputs via function calling:

agents:
  - name: extract
    operation: think
    config:
      provider: openai
      model: gpt-4-turbo
      schema: schemas/contact@v1

Cloudflare Workers AI

agents:
  - name: extract
    operation: think
    config:
      provider: cloudflare
      model: "@cf/meta/llama-3-8b-instruct"
      schema: schemas/simple@v1  # Use simpler schemas for smaller models

Best Practices

1. Use Descriptive Field Names

{
  "properties": {
    "invoice_number": {"type": "string"},      // Good: Clear and specific
    "num": {"type": "string"}                  // Bad: Ambiguous
  }
}

2. Add Descriptions

{
  "properties": {
    "total": {
      "type": "number",
      "description": "Total amount in USD, including tax"  // Helps AI understand
    }
  }
}

3. Use Enums for Categories

{
  "properties": {
    "priority": {
      "type": "string",
      "enum": ["low", "medium", "high", "critical"],  // Constrains output
      "description": "Task priority level"
    }
  }
}

4. Set Constraints

{
  "properties": {
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "Confidence score between 0 and 1"
    },
    "email": {
      "type": "string",
      "format": "email"  // Built-in validation
    }
  }
}

5. Version Schemas Carefully

# Production ensemble
schema: schemas/invoice@production

# Development ensemble
schema: schemas/invoice@latest

# Specific version for reproducibility
schema: schemas/[email protected]

6. Test with Different Temperature

agents:
  - name: extract-structured
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1  # Low for accurate extraction
      schema: schemas/invoice@v1

Common Patterns

Extraction with Validation

name: validated-extractor

flow:
  - agent: extract
  - agent: validate

agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      schema: schemas/contact@v1
      prompt: "Extract contact from: ${input.text}"

  - name: validate
    operation: code
    config:
      script: scripts/validate-contact
    input:
      contact: ${extract.output}

outputs:
  contact: ${validate.output.contact}

// scripts/validate-contact.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default function validateContact(context: AgentExecutionContext) {
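  // Cast the untyped input to the fields this check reads
  const { contact } = context.input as {
    contact: { email?: string; phone?: string }
  }

  // Enforce the schema's anyOf rule in code: at least one contact method
  if (!contact.email && !contact.phone) {
    throw new Error('Contact must have email or phone')
  }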

  return { valid: true, contact }
}

Multi-Schema Workflow

name: invoice-processor

flow:
  - agent: extract-invoice
  - agent: extract-vendor
  - agent: extract-line-items

agents:
  - name: extract-invoice
    operation: think
    config:
      schema: schemas/invoice@v1
      prompt: "Extract invoice header: ${input.document}"

  - name: extract-vendor
    operation: think
    config:
      schema: schemas/vendor@v1
      prompt: "Extract vendor details: ${input.document}"

  - name: extract-line-items
    operation: think
    config:
      schema: schemas/line-items@v1
      prompt: "Extract line items: ${input.document}"

outputs:
  invoice: ${extract-invoice.output}
  vendor: ${extract-vendor.output}
  line_items: ${extract-line-items.output}

Using ctx API in Agents

When building custom agents with TypeScript handlers, you can access schemas through the ctx API:

ctx.schemas.get(name)

Get a schema by name:

// agents/validator/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default async function validate(ctx: AgentExecutionContext) {
  // Get schema by name
  const contactSchema = await ctx.schemas.get('contact')

  return {
    schema: contactSchema,
    fields: Object.keys(contactSchema.properties || {})
  }
}

ctx.schemas.validate(name, data)

Validate data against a schema. Throws an error if validation fails:

// agents/validator/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default async function validateContact(ctx: AgentExecutionContext) {
  const { contact } = ctx.input as { contact: any }

  try {
    // Validate and get validated data
    const validated = await ctx.schemas.validate('contact', contact)

    return {
      success: true,
      data: validated
    }
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : 'Validation failed'
    }
  }
}

ctx.schemas.isValid(name, data)

Check if data is valid without throwing errors (returns boolean):

// agents/checker/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default async function checkContact(ctx: AgentExecutionContext) {
  const { contact } = ctx.input as { contact: any }

  // Check validity without throwing
  const isValid = await ctx.schemas.isValid('contact', contact)

  return {
    valid: isValid,
    data: contact
  }
}

Complete Example

// agents/contact-processor/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

interface ContactInput {
  contacts: Array<Record<string, any>>
}

export default async function processContacts(ctx: AgentExecutionContext) {
  const { contacts } = ctx.input as ContactInput

  const results = await Promise.all(
    contacts.map(async (contact) => {
      const isValid = await ctx.schemas.isValid('contact', contact)

      if (isValid) {
        return { contact, status: 'valid' }
      } else {
        return { contact, status: 'invalid' }
      }
    })
  )

  return {
    total: contacts.length,
    valid: results.filter(r => r.status === 'valid').length,
    invalid: results.filter(r => r.status === 'invalid').length,
    results
  }
}
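
Handlers like these can be exercised without a deployed worker by passing in a stand-in for `ctx.schemas`. The sketch below is a test double only: the `SchemaRegistry` interface is inferred from the three calls documented above, `createStubSchemas` is hypothetical, and each schema is backed by a hand-written predicate rather than a real JSON Schema validator. Returning `false` from `isValid` for an unknown schema name is also an assumption.

```typescript
// Assumed interface, inferred only from the get/validate/isValid calls above.
interface SchemaRegistry {
  get(name: string): Promise<Record<string, unknown>>;
  validate(name: string, data: unknown): Promise<unknown>;
  isValid(name: string, data: unknown): Promise<boolean>;
}

interface StubEntry {
  schema: Record<string, unknown>;
  check: (data: unknown) => boolean; // stands in for real schema validation
}

// Hypothetical test double for ctx.schemas.
function createStubSchemas(entries: Map<string, StubEntry>): SchemaRegistry {
  const lookup = (name: string): StubEntry => {
    const entry = entries.get(name);
    if (!entry) throw new Error(`Unknown schema "${name}"`);
    return entry;
  };
  return {
    get: async (name) => lookup(name).schema,
    validate: async (name, data) => {
      if (!lookup(name).check(data)) {
        throw new Error(`Validation failed for "${name}"`);
      }
      return data;
    },
    // Assumption: unknown names are reported as invalid rather than thrown.
    isValid: async (name, data) => entries.get(name)?.check(data) ?? false,
  };
}

// Same counting logic as processContacts, with the registry as a parameter.
async function countValid(schemas: SchemaRegistry, contacts: unknown[]) {
  const flags = await Promise.all(
    contacts.map((contact) => schemas.isValid("contact", contact))
  );
  const valid = flags.filter(Boolean).length;
  return { total: contacts.length, valid, invalid: contacts.length - valid };
}
```

Taking the registry as a parameter keeps the counting logic independent of the runtime, so the same function works with the stub in tests and with `ctx.schemas` in a deployed agent, provided the real API matches the assumed interface.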

Troubleshooting

Schema Not Found

Error: Failed to resolve schema "schemas/invoice@v1"

Solution:

Check schema exists: edgit list schemas
Verify version: edgit tag list invoice
Use local file path for testing: ./schemas/invoice.json

Invalid JSON Output

Issue: AI returns text instead of JSON

Solutions:

Lower temperature: temperature: 0.1
Use more capable model: claude-sonnet-4 or claude-opus-4
Add explicit instruction in prompt:

prompt: |
  Extract contact information and return ONLY valid JSON matching the schema.

  Text: ${input.text}
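
If a reply still arrives wrapped in prose, a downstream code step can try to recover the JSON object before giving up. This is a defensive sketch only; `extractJsonObject` is a hypothetical helper, and it simply takes the span from the first `{` to the last `}`, which will not handle every malformed reply.

```typescript
// Hypothetical fallback: recover a JSON object from a text reply.
function extractJsonObject(text: string): unknown {
  // Fast path: the whole reply is already valid JSON.
  try {
    return JSON.parse(text);
  } catch {
    // Fall through to the span-based attempt below.
  }

  // Take everything from the first "{" to the last "}" and parse that.
  const start = text.indexOf("{");
  const end = text.lastIndexOf("}");
  if (start === -1 || end <= start) {
    throw new Error("No JSON object found in model output");
  }
  return JSON.parse(text.slice(start, end + 1));
}
```

The recovered value is still unvalidated, so it should be checked against the schema before it is used.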

Schema Too Complex

Issue: AI struggles with complex nested schemas

Solutions:

Break into multiple agents with simpler schemas
Use claude-opus-4 for complex schemas
Simplify schema structure
Provide examples in the prompt

Next Steps

Think Operation

Learn about AI reasoning operations

Component Versioning

Version control for components

JSON Schema

JSON Schema specification

Playbooks

Browse real-world examples
