Define reusable JSON Schemas for structured AI outputs with Claude’s native JSON Schema support.

Overview

Schema components enable you to:
  • Get structured outputs from AI models using JSON Schema
  • Version schemas with edgit for consistency
  • Share schemas across multiple agents and ensembles
  • Validate outputs against well-defined schemas

Quick Start

1. Create a Schema Component

Create a JSON Schema file:
// schemas/contact.json
{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Contact",
  "description": "Contact information schema",
  "type": "object",
  "properties": {
    "first_name": {
      "type": "string",
      "description": "Contact's first name"
    },
    "last_name": {
      "type": "string",
      "description": "Contact's last name"
    },
    "email": {
      "type": "string",
      "format": "email",
      "description": "Email address"
    },
    "phone": {
      "type": "string",
      "description": "Phone number"
    },
    "company": {
      "type": "string",
      "description": "Company name"
    }
  },
  "required": ["first_name", "last_name"],
  "anyOf": [
    {"required": ["email"]},
    {"required": ["phone"]}
  ]
}
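
The `required` and `anyOf` keywords above combine so that a contact needs both names plus at least one of `email` or `phone`. A minimal TypeScript sketch of that rule follows. The `satisfiesContactRules` helper is hypothetical and hand-written for illustration; it is not part of Conductor, and a real validator would evaluate the schema itself.

```typescript
// Hypothetical helper: mirrors only the `required` and `anyOf` rules
// of schemas/contact.json, not full JSON Schema semantics.
type Contact = {
  first_name?: string
  last_name?: string
  email?: string
  phone?: string
  company?: string
}

function satisfiesContactRules(c: Contact): boolean {
  // "required": ["first_name", "last_name"]
  const hasNames = c.first_name !== undefined && c.last_name !== undefined
  // "anyOf": at least one of email or phone must be present
  const hasChannel = c.email !== undefined || c.phone !== undefined
  return hasNames && hasChannel
}

const withPhone = satisfiesContactRules({
  first_name: 'John',
  last_name: 'Smith',
  phone: '555-0100'
})
const namesOnly = satisfiesContactRules({ first_name: 'John', last_name: 'Smith' })
```

Here `withPhone` passes because a phone number satisfies the `anyOf` branch, while `namesOnly` fails because neither contact channel is present.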

2. Reference the Schema

Use the schema in your ensemble:
name: contact-extractor

flow:
  - agent: extract

agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1
      prompt: |
        Extract contact information from: ${input.text}
      # Reference the schema component
      schema: "schema://[email protected]"

inputs:
  text:
    type: string
    required: true

outputs:
  contact: ${extract.output}

3. Get Structured Output

curl -X POST https://your-worker.workers.dev/api/v1/execute/ensemble/contact-extractor \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "input": {
      "text": "John Smith, [email protected], works at Acme Corp"
    }
  }'
Response:
{
  "success": true,
  "output": {
    "first_name": "John",
    "last_name": "Smith",
    "email": "[email protected]",
    "company": "Acme Corp"
  }
}
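
The same request can be issued from TypeScript. The sketch below only builds the request so it can be inspected before sending; `buildExecuteRequest` is a hypothetical helper, not a Conductor API, and the URL path and headers are copied from the curl command above.

```typescript
// Sketch: builds the same POST request as the curl command above.
// `buildExecuteRequest` is a hypothetical helper, not a Conductor API.
interface ExecuteRequest {
  url: string
  init: { method: string; headers: Record<string, string>; body: string }
}

function buildExecuteRequest(
  baseUrl: string,
  ensemble: string,
  token: string,
  text: string
): ExecuteRequest {
  return {
    url: `${baseUrl}/api/v1/execute/ensemble/${encodeURIComponent(ensemble)}`,
    init: {
      method: 'POST',
      headers: {
        Authorization: `Bearer ${token}`,
        'Content-Type': 'application/json'
      },
      // Same payload shape as the -d argument: { "input": { "text": ... } }
      body: JSON.stringify({ input: { text } })
    }
  }
}

const req = buildExecuteRequest(
  'https://your-worker.workers.dev',
  'contact-extractor',
  'YOUR_TOKEN',
  'John Smith works at Acme Corp'
)
// To send it: const res = await fetch(req.url, req.init)
```

Keeping request construction separate from `fetch` makes the payload easy to log or unit test without calling the worker.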

How to Reference in Ensembles

There are three ways to reference schemas in your ensembles:

1. URI Format

Use the schema:// URI format to reference versioned schema components:
ensemble: invoice-extractor

agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1
      prompt: |
        Extract invoice data from: ${input.text}
      schema: "schema://[email protected]"

inputs:
  text:
    type: string

outputs:
  invoice_data: ${extract.output}

2. Template Expression Format

Use ${components.schema_name@version} to embed schema references:
ensemble: multi-field-extractor

agents:
  - name: extract-structured
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      prompt: "Extract data: ${input.text}"
      schema:
        type: object
        properties:
          # Embed schema definitions from components
          contact: ${components.contact-schema@v1}
          address: ${components.address-schema@v1}

inputs:
  text:
    type: string

outputs:
  data: ${extract-structured.output}
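
To make the reference syntax concrete, here is a rough sketch of how a `${components.name@version}` expression splits into a name and a version. The `parseComponentRef` function and its accepted character set are assumptions made for illustration; Conductor's own resolver may accept more forms.

```typescript
// Hypothetical parser for the reference syntax shown above;
// Conductor's own resolver may accept more forms than this sketch.
interface ComponentRef {
  name: string
  version: string
}

function parseComponentRef(expr: string): ComponentRef | null {
  // Matches ${components.<name>@<version>} and nothing else
  const match = /^\$\{components\.([A-Za-z0-9_-]+)@([A-Za-z0-9_.-]+)\}$/.exec(expr.trim())
  return match ? { name: match[1], version: match[2] } : null
}

const ref = parseComponentRef('${components.contact-schema@v1}')
const bad = parseComponentRef('${components.contact-schema}')
```

In this sketch a reference without an `@version` suffix is rejected (`bad` is `null`), which forces every embedded schema to be pinned explicitly.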

3. Inline Schema

For simple operations or during development, define the schema directly in YAML:
ensemble: sentiment-classifier

agents:
  - name: classify
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1
      prompt: "Classify the sentiment: ${input.text}"
      schema:
        type: object
        properties:
          category:
            type: string
            enum: [positive, negative, neutral]
          confidence:
            type: number
            minimum: 0
            maximum: 1
        required: [category, confidence]

inputs:
  text:
    type: string

outputs:
  sentiment: ${classify.output}

Example Schemas

Invoice Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "Invoice",
  "description": "Invoice data extraction schema",
  "type": "object",
  "properties": {
    "invoice_number": {
      "type": "string",
      "description": "Invoice number or ID"
    },
    "invoice_date": {
      "type": "string",
      "format": "date",
      "description": "Invoice date (YYYY-MM-DD)"
    },
    "vendor": {
      "type": "object",
      "properties": {
        "name": {"type": "string"},
        "address": {"type": "string"},
        "email": {"type": "string", "format": "email"}
      },
      "required": ["name"]
    },
    "line_items": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "description": {"type": "string"},
          "quantity": {"type": "number"},
          "unit_price": {"type": "number"},
          "total": {"type": "number"}
        },
        "required": ["description", "quantity", "unit_price", "total"]
      }
    },
    "total": {
      "type": "number",
      "description": "Total amount due"
    }
  },
  "required": ["invoice_number", "invoice_date", "vendor", "line_items", "total"]
}
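
JSON Schema can constrain each field's type but cannot express arithmetic between fields. If your invoices are expected to satisfy `total = quantity * unit_price` on every line item (an assumed business rule, not something this schema states), a follow-up check along these lines could flag inconsistent extractions. The `inconsistentLineItems` helper and its tolerance are illustrative only.

```typescript
// Assumed business rule (not stated by the schema): each line item's
// total should equal quantity * unit_price within a small tolerance.
interface LineItem {
  description: string
  quantity: number
  unit_price: number
  total: number
}

function inconsistentLineItems(items: LineItem[], tolerance = 0.005): LineItem[] {
  return items.filter(
    (item) => Math.abs(item.quantity * item.unit_price - item.total) > tolerance
  )
}

const flagged = inconsistentLineItems([
  { description: 'Widget', quantity: 2, unit_price: 10, total: 20 },
  { description: 'Gadget', quantity: 3, unit_price: 5, total: 14 }
])
```

Only the second item is flagged here, because 3 × 5 differs from its stated total of 14.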

Sentiment Analysis Schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "title": "SentimentAnalysis",
  "type": "object",
  "properties": {
    "sentiment": {
      "type": "string",
      "enum": ["positive", "negative", "neutral", "mixed"],
      "description": "Overall sentiment"
    },
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "Confidence score (0-1)"
    },
    "emotions": {
      "type": "array",
      "items": {
        "type": "object",
        "properties": {
          "emotion": {
            "type": "string",
            "enum": ["joy", "sadness", "anger", "fear", "surprise", "disgust"]
          },
          "intensity": {
            "type": "number",
            "minimum": 0,
            "maximum": 1
          }
        },
        "required": ["emotion", "intensity"]
      }
    },
    "key_phrases": {
      "type": "array",
      "items": {"type": "string"}
    },
    "reasoning": {
      "type": "string",
      "description": "Explanation of the sentiment analysis"
    }
  },
  "required": ["sentiment", "confidence", "reasoning"]
}

Schema Versioning

Schema components follow semantic versioning:
# Version a schema
edgit tag create invoice v1.0.0

# Update schema
edgit tag create invoice v1.1.0

# Tag as production
edgit tag set invoice production v1.1.0
In YAML:
schema: schemas/[email protected]     # Specific version (immutable)
schema: schemas/invoice@v1         # Major version (gets latest v1.x.x)
schema: schemas/invoice@production # Tagged version
schema: schemas/invoice@latest     # Always latest
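
The four reference styles can be read as a small resolution procedure. The sketch below is one plausible interpretation; how edgit actually resolves references is an assumption here, and `resolveVersion`, `parseSemver`, and `compareSemver` are hypothetical helpers.

```typescript
// Sketch of the four reference styles above. How edgit actually resolves
// them is an assumption; these helpers are hypothetical.
type Semver = [number, number, number]

function parseSemver(v: string): Semver | null {
  const m = /^v(\d+)\.(\d+)\.(\d+)$/.exec(v)
  return m ? [Number(m[1]), Number(m[2]), Number(m[3])] : null
}

function compareSemver(a: Semver, b: Semver): number {
  return a[0] - b[0] || a[1] - b[1] || a[2] - b[2]
}

function resolveVersion(
  ref: string,
  versions: string[],
  tags: Record<string, string>
): string | null {
  // Only well-formed vX.Y.Z versions are considered, sorted oldest first.
  const sorted = versions
    .filter((v) => parseSemver(v) !== null)
    .sort((a, b) => compareSemver(parseSemver(a)!, parseSemver(b)!))
  if (ref === 'latest') return sorted[sorted.length - 1] ?? null
  if (parseSemver(ref)) return sorted.includes(ref) ? ref : null // exact version
  const major = /^v(\d+)$/.exec(ref)
  if (major) {
    // Major-only reference: newest release within that major line
    const inMajor = sorted.filter((v) => parseSemver(v)![0] === Number(major[1]))
    return inMajor[inMajor.length - 1] ?? null
  }
  return tags[ref] ?? null // named tag such as "production"
}

const versions = ['v1.0.0', 'v1.1.0', 'v2.0.0']
const tags = { production: 'v1.1.0' }
const resolvedMajor = resolveVersion('v1', versions, tags)
```

Under these assumptions, `v1` follows new minor and patch releases but never crosses into `v2.x.x`, which is why a major-version reference is a common middle ground between a pinned version and `latest`.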

Provider Support

Anthropic (Claude)

Claude natively supports JSON Schema for structured outputs:
agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4  # Full JSON Schema support
      schema: schemas/invoice@v1
Models with schema support:
  • claude-opus-4 - Best for complex schemas
  • claude-sonnet-4 - Balanced performance
  • claude-haiku-4 - Fast for simple schemas

OpenAI

OpenAI supports structured outputs via function calling:
agents:
  - name: extract
    operation: think
    config:
      provider: openai
      model: gpt-4-turbo
      schema: schemas/contact@v1
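
For intuition about what "via function calling" means, the sketch below wraps a JSON Schema in the tool definition shape used by OpenAI's Chat Completions `tools` parameter. How Conductor performs this mapping internally is an assumption; `toOpenAITool` and the `JsonSchema` interface are hypothetical and shown only to illustrate the idea.

```typescript
// Sketch only: how Conductor maps schemas to OpenAI requests internally is
// an assumption. The returned shape follows OpenAI's `tools` format.
interface JsonSchema {
  title?: string
  description?: string
  type: string
  properties?: Record<string, unknown>
  required?: string[]
}

function toOpenAITool(schema: JsonSchema, fallbackName = 'structured_output') {
  // OpenAI function names allow only letters, digits, underscores, and dashes.
  const name = (schema.title ?? fallbackName).replace(/[^A-Za-z0-9_-]/g, '_')
  return {
    type: 'function' as const,
    function: {
      name,
      description: schema.description ?? '',
      // The JSON Schema itself describes the function's arguments.
      parameters: schema
    }
  }
}

const tool = toOpenAITool({
  title: 'Contact',
  description: 'Contact information schema',
  type: 'object',
  properties: { first_name: { type: 'string' } },
  required: ['first_name']
})
```

The model is then asked to "call" this function, and the arguments it supplies are the structured output.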

Cloudflare Workers AI

agents:
  - name: extract
    operation: think
    config:
      provider: cloudflare
      model: "@cf/meta/llama-3-8b-instruct"
      schema: schemas/simple@v1  # Use simpler schemas for smaller models

Best Practices

1. Use Descriptive Field Names

{
  "properties": {
    "invoice_number": {"type": "string"},      // Good: Clear and specific
    "num": {"type": "string"}                  // Bad: Ambiguous
  }
}

2. Add Descriptions

{
  "properties": {
    "total": {
      "type": "number",
      "description": "Total amount in USD, including tax"  // Helps AI understand
    }
  }
}

3. Use Enums for Categories

{
  "properties": {
    "priority": {
      "type": "string",
      "enum": ["low", "medium", "high", "critical"],  // Constrains output
      "description": "Task priority level"
    }
  }
}

4. Set Constraints

{
  "properties": {
    "confidence": {
      "type": "number",
      "minimum": 0,
      "maximum": 1,
      "description": "Confidence score between 0 and 1"
    },
    "email": {
      "type": "string",
      "format": "email"  // Built-in validation
    }
  }
}

5. Version Schemas Carefully

# Production ensemble
schema: schemas/invoice@production

# Development ensemble
schema: schemas/invoice@latest

# Specific version for reproducibility
schema: schemas/[email protected]

6. Test with Different Temperatures

agents:
  - name: extract-structured
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      temperature: 0.1  # Low for accurate extraction
      schema: schemas/invoice@v1

Common Patterns

Extraction with Validation

name: validated-extractor

flow:
  - agent: extract
  - agent: validate

agents:
  - name: extract
    operation: think
    config:
      provider: anthropic
      model: claude-sonnet-4
      schema: schemas/contact@v1
      prompt: "Extract contact from: ${input.text}"

  - name: validate
    operation: code
    config:
      script: scripts/validate-contact
    input:
      contact: ${extract.output}

outputs:
  contact: ${validate.output.contact}
// scripts/validate-contact.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default function validateContact(context: AgentExecutionContext) {
  const { contact } = context.input

  // Additional validation
  if (!contact.email && !contact.phone) {
    throw new Error('Contact must have email or phone')
  }

  return { valid: true, contact }
}

Multi-Schema Workflow

name: invoice-processor

flow:
  - agent: extract-invoice
  - agent: extract-vendor
  - agent: extract-line-items

agents:
  - name: extract-invoice
    operation: think
    config:
      schema: schemas/invoice@v1
      prompt: "Extract invoice header: ${input.document}"

  - name: extract-vendor
    operation: think
    config:
      schema: schemas/vendor@v1
      prompt: "Extract vendor details: ${input.document}"

  - name: extract-line-items
    operation: think
    config:
      schema: schemas/line-items@v1
      prompt: "Extract line items: ${input.document}"

outputs:
  invoice: ${extract-invoice.output}
  vendor: ${extract-vendor.output}
  line_items: ${extract-line-items.output}

Using ctx API in Agents

When building custom agents with TypeScript handlers, you can access schemas through the ctx API:

ctx.schemas.get(name)

Get a schema by name:
// agents/validator/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default async function validate(ctx: AgentExecutionContext) {
  // Get schema by name
  const contactSchema = await ctx.schemas.get('contact')

  return {
    schema: contactSchema,
    fields: Object.keys(contactSchema.properties || {})
  }
}

ctx.schemas.validate(name, data)

Validate data against a schema. Throws an error if validation fails:
// agents/validator/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default async function validateContact(ctx: AgentExecutionContext) {
  const { contact } = ctx.input as { contact: any }

  try {
    // Validate and get validated data
    const validated = await ctx.schemas.validate('contact', contact)

    return {
      success: true,
      data: validated
    }
  } catch (error) {
    return {
      success: false,
      error: error instanceof Error ? error.message : 'Validation failed'
    }
  }
}

ctx.schemas.isValid(name, data)

Check if data is valid without throwing errors (returns boolean):
// agents/checker/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

export default async function checkContact(ctx: AgentExecutionContext) {
  const { contact } = ctx.input as { contact: any }

  // Check validity without throwing
  const isValid = await ctx.schemas.isValid('contact', contact)

  return {
    valid: isValid,
    data: contact
  }
}

Complete Example

// agents/contact-processor/index.ts
import type { AgentExecutionContext } from '@ensemble-edge/conductor'

interface ContactInput {
  contacts: Array<Record<string, any>>
}

export default async function processContacts(ctx: AgentExecutionContext) {
  const { contacts } = ctx.input as ContactInput

  const results = await Promise.all(
    contacts.map(async (contact) => {
      const isValid = await ctx.schemas.isValid('contact', contact)

      if (isValid) {
        return { contact, status: 'valid' }
      } else {
        return { contact, status: 'invalid' }
      }
    })
  )

  return {
    total: contacts.length,
    valid: results.filter(r => r.status === 'valid').length,
    invalid: results.filter(r => r.status === 'invalid').length,
    results
  }
}
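
Handlers like this can be exercised without a running worker by passing in a stub context. The sketch below uses local stand-in types (`CtxStub`, `SchemaApiStub`) instead of the real `AgentExecutionContext`, and the stubbed `isValid` applies a simplified rule rather than real schema validation; all of these names are assumptions made for the example.

```typescript
// Local stand-ins for the parts of the ctx API used above. The real
// AgentExecutionContext type comes from '@ensemble-edge/conductor';
// these stubs exist only so the counting logic can run in isolation.
interface SchemaApiStub {
  isValid(name: string, data: unknown): Promise<boolean>
}

interface CtxStub {
  input: { contacts: Array<Record<string, unknown>> }
  schemas: SchemaApiStub
}

async function countValid(ctx: CtxStub): Promise<{ valid: number; invalid: number }> {
  const flags = await Promise.all(
    ctx.input.contacts.map((c) => ctx.schemas.isValid('contact', c))
  )
  const valid = flags.filter(Boolean).length
  return { valid, invalid: flags.length - valid }
}

// Stubbed validator: treats a contact as valid when it has an email or phone.
const stubCtx: CtxStub = {
  input: { contacts: [{ email: 'a@example.com' }, { company: 'Acme Corp' }] },
  schemas: {
    isValid: async (_name, data) => {
      const d = data as Record<string, unknown>
      return 'email' in d || 'phone' in d
    }
  }
}

const summaryPromise = countValid(stubCtx)
```

Swapping the stub for the real context should not require changes to the counting logic, since it depends only on the `isValid(name, data)` call shape documented above.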

Troubleshooting

Schema Not Found

Error: Failed to resolve schema "schemas/invoice@v1"
Solution:
  1. Check schema exists: edgit list schemas
  2. Verify version: edgit tag list invoice
  3. Use local file path for testing: ./schemas/invoice.json

Invalid JSON Output

Issue: AI returns text instead of JSON
Solutions:
  1. Lower temperature: temperature: 0.1
  2. Use more capable model: claude-sonnet-4 or claude-opus-4
  3. Add explicit instruction in prompt:
prompt: |
  Extract contact information and return ONLY valid JSON matching the schema.

  Text: ${input.text}
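
If a model still wraps its JSON in prose or a Markdown code fence, a defensive parsing step in a custom agent can sometimes recover it. The sketch below is a hypothetical fallback, not part of Conductor, and it does not validate the result against the schema.

```typescript
// Hypothetical fallback parser; not part of Conductor. It handles two common
// cases: JSON wrapped in a Markdown code fence, and JSON surrounded by prose.
function extractJsonObject(text: string): unknown {
  // Prefer the contents of a Markdown code fence if one is present.
  const fenced = new RegExp('`{3}(?:json)?\\s*([\\s\\S]*?)`{3}').exec(text)
  const candidate = fenced ? fenced[1] : text
  // Within that text, take the outermost {...} span.
  const start = candidate.indexOf('{')
  const end = candidate.lastIndexOf('}')
  if (start === -1 || end <= start) return null
  try {
    return JSON.parse(candidate.slice(start, end + 1))
  } catch {
    return null // the span was not valid JSON after all
  }
}

const parsed = extractJsonObject(
  'Here is the contact: {"first_name": "John", "last_name": "Smith"} Hope that helps!'
)
const none = extractJsonObject('No structured data found.')
```

Treat this as a last resort: recovered output should still be checked, for example with `ctx.schemas.isValid`, before it is passed downstream.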

Schema Too Complex

Issue: AI struggles with complex nested schemas
Solutions:
  1. Break into multiple agents with simpler schemas
  2. Use claude-opus-4 for complex schemas
  3. Simplify schema structure
  4. Provide examples in the prompt

Next Steps