Your First Agent

Agents are reusable units of work that encapsulate logic. Think of them as functions you can version, deploy, and compose. The pre-built agents (scraper, validator, rag, hitl) cover common tasks; when you need custom logic, you build your own agent.

What’s an Agent?

An agent is a reusable unit of work with:
  • Inputs: Parameters it accepts
  • Logic: What it does (operations you declare)
  • Outputs: Data it returns
Agents can use any operation (think, code, storage, http, etc.) and can be versioned independently with Edgit.

Create Your First Agent

Let’s build a company-enricher agent that takes a company name and returns structured data about the company (name, description, industry, founding year), optionally with recent news.

Project Structure

my-conductor-app/
  agents/
    company-enricher/
      agent.yaml      # Agent definition
  ensembles/
    enrich-company.yaml # Ensemble using the agent

Define the Agent

Create agents/company-enricher/agent.yaml:
agent: company-enricher
description: Enriches company data from web sources

inputs:
  company_name:
    type: string
    required: true
  include_news:
    type: boolean
    default: false

operations:
  # Search for company website
  - name: search
    operation: http
    config:
      url: https://api.duckduckgo.com/?q=${input.company_name}+official+website&format=json
      method: GET
    cache:
      ttl: 86400  # Cache for 24 hours

  # Scrape the URL returned by the search (AbstractURL can be empty when DuckDuckGo has no instant answer)
  - name: scrape
    operation: http
    config:
      url: ${search.output.body.AbstractURL}
      method: GET

  # Extract structured data with AI
  - name: extract
    operation: think
    config:
      provider: cloudflare
      model: '@cf/meta/llama-3.1-8b-instruct'
      prompt: |
        Extract company information from this HTML:
        ${scrape.output.body}

        Return JSON with:
        - name: Company name
        - description: Brief description (1-2 sentences)
        - industry: Primary industry
        - founded: Year founded (if available)

  # Optionally fetch news
  - name: fetch-news
    operation: http
    condition: ${input.include_news}
    config:
      url: https://api.example.com/news?company=${input.company_name}
      method: GET

outputs:
  company_data: ${extract.output}
  news: ${fetch-news.output.body}
  source_url: ${search.output.body.AbstractURL}
That’s it. Your agent is defined declaratively. No classes, no boilerplate.
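
The `${...}` references in the definition above read values from the agent's inputs and from earlier operations. As a rough mental model only (this is not Conductor's resolver, and `Context`, `lookup`, and `interpolate` are hypothetical names), simple dotted references could be resolved like this:

```typescript
// Illustrative sketch: resolves plain "${a.b.c}" references only.
// Expressions such as "a || b" or ternaries are out of scope here.
type Context = Record<string, unknown>;

// Walk a dotted path such as "search.output.body.AbstractURL" through nested objects.
function lookup(context: Context, path: string): unknown {
  let current: unknown = context;
  for (const key of path.split('.')) {
    if (current === null || typeof current !== 'object') return undefined;
    current = (current as Record<string, unknown>)[key];
  }
  return current;
}

// Replace every ${path} placeholder with its resolved value (empty string if missing).
function interpolate(template: string, context: Context): string {
  return template.replace(/\$\{([^}]+)\}/g, (_match: string, path: string) => {
    const value = lookup(context, path.trim());
    return value === undefined ? '' : String(value);
  });
}

// Sample context shaped like the company-enricher agent after the search step.
const context: Context = {
  input: { company_name: 'Anthropic', include_news: false },
  search: { output: { body: { AbstractURL: 'https://www.anthropic.com' } } },
};

const scrapeUrl = interpolate('${search.output.body.AbstractURL}', context);
const newsUrl = interpolate(
  'https://api.example.com/news?company=${input.company_name}',
  context,
);
```

Here `scrapeUrl` takes its value from the search step's output and `newsUrl` from the agent input, which is the data flow the YAML declares.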

Use Your Agent

Create ensembles/enrich-company.yaml:
ensemble: enrich-company
description: Enrich company data

agents:
  - name: enricher
    agent: company-enricher
    inputs:
      company_name: ${input.company}
      include_news: true

output:
  data: ${enricher.output.company_data}
  news: ${enricher.output.news}
  source: ${enricher.output.source_url}

Execute It

import { Conductor } from '@ensemble-edge/conductor';

// `env` is assumed to be your runtime's environment bindings, already in scope
const conductor = new Conductor({ env });

const result = await conductor.execute('enrich-company', {
  company: 'Anthropic'
});

console.log(result);
Result:
{
  "data": {
    "name": "Anthropic",
    "description": "AI safety company building reliable, interpretable AI systems",
    "industry": "Artificial Intelligence",
    "founded": 2021
  },
  "news": [...],
  "source": "https://www.anthropic.com"
}

Agent Patterns

Pattern 1: Data Processor

Transform and validate data:
agent: data-processor

inputs:
  raw_data:
    type: object
    required: true

operations:
  # Validate structure
  - name: validate
    operation: code
    config:
      code: |
        const data = ${input.raw_data};
        const isValid = Boolean(data.id && data.name && data.email);  // coerce so `valid` is true/false, not the email string
        return { valid: isValid, errors: isValid ? [] : ['Missing required fields'] };

  # Transform data
  - name: transform
    operation: code
    condition: ${validate.output.valid}
    config:
      code: |
        const data = ${input.raw_data};
        return {
          id: data.id,
          name: data.name.trim().toLowerCase(),
          email: data.email.toLowerCase(),
          created_at: Date.now()
        };

  # Store in database
  - name: store
    operation: storage
    condition: ${validate.output.valid}
    config:
      type: d1
      query: |
        INSERT INTO users (id, name, email, created_at)
        VALUES (?, ?, ?, ?)
      params:
        - ${transform.output.id}
        - ${transform.output.name}
        - ${transform.output.email}
        - ${transform.output.created_at}

outputs:
  success: ${validate.output.valid}
  data: ${transform.output}
  errors: ${validate.output.errors}
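
The validate and transform steps above are ordinary JavaScript, so the same logic can be prototyped and unit-tested as plain functions before it goes into YAML. The following is a minimal sketch under that assumption; the type and function names (`RawUser`, `processUser`, and so on) are hypothetical, and unlike the YAML version it reports each missing field by name.

```typescript
// Plain-TypeScript mirror of the validate -> transform steps (illustrative only).
interface RawUser {
  id?: string;
  name?: string;
  email?: string;
}

interface User {
  id: string;
  name: string;
  email: string;
  created_at: number;
}

interface ProcessResult {
  success: boolean;
  data: User | null;
  errors: string[];
}

// Return one message per missing (or empty) required field.
function validate(raw: RawUser): string[] {
  const required: Array<keyof RawUser> = ['id', 'name', 'email'];
  return required
    .filter((field) => !raw[field])
    .map((field) => `Missing required field: ${field}`);
}

// Validate first; transform only when validation passes, as the `condition` does.
function processUser(raw: RawUser, now: number = Date.now()): ProcessResult {
  const errors = validate(raw);
  const { id, name, email } = raw;
  // The explicit field checks repeat validate() so TypeScript can narrow the types.
  if (errors.length > 0 || !id || !name || !email) {
    return { success: false, data: null, errors };
  }
  return {
    success: true,
    data: {
      id,
      name: name.trim().toLowerCase(),
      email: email.toLowerCase(),
      created_at: now,
    },
    errors: [],
  };
}
```

Passing `now` as a parameter keeps the function deterministic in tests.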

Pattern 2: AI Pipeline

Chain AI operations:
agent: content-analyzer

inputs:
  text:
    type: string
    required: true

operations:
  # Step 1: Extract entities
  - name: extract-entities
    operation: think
    config:
      provider: openai
      model: gpt-4o-mini
      prompt: |
        Extract named entities from: ${input.text}
        Return JSON: {"people": [], "organizations": [], "locations": []}

  # Step 2: Analyze sentiment
  - name: analyze-sentiment
    operation: think
    config:
      provider: openai
      model: gpt-4o-mini
      prompt: |
        Analyze sentiment of: ${input.text}
        Return JSON: {"sentiment": "positive|negative|neutral", "confidence": <number from 0 to 1>}

  # Step 3: Generate summary
  - name: summarize
    operation: think
    config:
      provider: openai
      model: gpt-4o-mini
      prompt: |
        Summarize in 2 sentences: ${input.text}

outputs:
  entities: ${extract-entities.output}
  sentiment: ${analyze-sentiment.output}
  summary: ${summarize.output}
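
The prompts above ask the model for JSON, but models sometimes wrap JSON in extra prose. Whether Conductor parses the reply for you is not covered here; if you end up handling the raw text yourself (for example in a follow-up code operation), a defensive parser along these lines may help. `extractJson` and `parseSentiment` are hypothetical helper names.

```typescript
type SentimentLabel = 'positive' | 'negative' | 'neutral';

interface Sentiment {
  sentiment: SentimentLabel;
  confidence: number;
}

// Take the span from the first '{' to the last '}' and try to parse it.
function extractJson(reply: string): unknown {
  const start = reply.indexOf('{');
  const end = reply.lastIndexOf('}');
  if (start === -1 || end <= start) return null;
  try {
    return JSON.parse(reply.slice(start, end + 1));
  } catch {
    return null; // the span was not valid JSON
  }
}

// Accept the reply only if it matches the shape requested in the prompt.
function parseSentiment(reply: string): Sentiment | null {
  const parsed = extractJson(reply);
  if (parsed === null || typeof parsed !== 'object') return null;
  const { sentiment, confidence } = parsed as Record<string, unknown>;
  const labels: string[] = ['positive', 'negative', 'neutral'];
  if (typeof sentiment !== 'string' || labels.indexOf(sentiment) === -1) return null;
  if (typeof confidence !== 'number' || confidence < 0 || confidence > 1) return null;
  return { sentiment: sentiment as SentimentLabel, confidence };
}
```

Returning `null` for anything unexpected lets the caller decide whether to retry, fall back, or surface an error.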

Pattern 3: API Orchestrator

Coordinate multiple API calls:
agent: user-profile-aggregator

inputs:
  user_id:
    type: string
    required: true

operations:
  # Fetch user data from service 1
  - name: fetch-profile
    operation: http
    config:
      url: https://api1.example.com/users/${input.user_id}
      method: GET
      headers:
        Authorization: Bearer ${env.API1_TOKEN}

  # Fetch activity from service 2
  - name: fetch-activity
    operation: http
    config:
      url: https://api2.example.com/activity?user=${input.user_id}
      method: GET
      headers:
        Authorization: Bearer ${env.API2_TOKEN}

  # Fetch preferences from service 3
  - name: fetch-preferences
    operation: http
    config:
      url: https://api3.example.com/preferences/${input.user_id}
      method: GET

  # Merge data
  - name: merge
    operation: code
    config:
      code: |
        return {
          profile: ${fetch-profile.output.body},
          activity: ${fetch-activity.output.body},
          preferences: ${fetch-preferences.output.body},
          merged_at: Date.now()
        };

outputs:
  user_data: ${merge.output}
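
The three fetches above do not depend on each other, so the equivalent hand-written code can issue them concurrently and merge the results. How Conductor schedules independent operations is not described here; the sketch below only shows the shape of the pattern. `FetchJson` and `aggregateUser` are hypothetical names, the fetcher is injected so the function can be tested without a network, and the Authorization headers are omitted for brevity.

```typescript
// A function that fetches a URL and resolves to its parsed body.
type FetchJson = (url: string) => Promise<unknown>;

interface UserData {
  profile: unknown;
  activity: unknown;
  preferences: unknown;
  merged_at: number;
}

async function aggregateUser(
  userId: string,
  fetchJson: FetchJson,
  now: () => number = Date.now,
): Promise<UserData> {
  const id = encodeURIComponent(userId); // keep unusual IDs from breaking the URL
  // Start all three requests before awaiting any of them.
  const [profile, activity, preferences] = await Promise.all([
    fetchJson(`https://api1.example.com/users/${id}`),
    fetchJson(`https://api2.example.com/activity?user=${id}`),
    fetchJson(`https://api3.example.com/preferences/${id}`),
  ]);
  return { profile, activity, preferences, merged_at: now() };
}
```

Note that `Promise.all` rejects as soon as any one request fails; if partial results are acceptable, `Promise.allSettled` is the usual alternative.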

Advanced Features

Caching

Cache expensive operations:
operations:
  - name: expensive-ai-call
    operation: think
    config:
      provider: openai
      model: gpt-4
      prompt: ${input.query}
    cache:
      ttl: 3600  # Cache for 1 hour
      key: ai-${input.query}  # Custom cache key
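
To make the `ttl` and `key` settings concrete, here is a minimal in-memory sketch of time-based caching. It assumes `ttl` is in seconds (consistent with the "1 hour" comment above) and is not Conductor's cache implementation; `TtlCache` and `cached` are hypothetical names.

```typescript
interface CacheEntry<V> {
  value: V;
  expiresAt: number; // epoch milliseconds
}

class TtlCache<V> {
  private entries = new Map<string, CacheEntry<V>>();
  private now: () => number;

  // The clock is injectable so expiry can be tested without waiting.
  constructor(now: () => number = Date.now) {
    this.now = now;
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (entry === undefined) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // entry has expired
      return undefined;
    }
    return entry.value;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
  }
}

// Return the cached value when present; otherwise compute and store it.
// Limitation of this sketch: a computed value of `undefined` is never treated as a hit.
function cached<V>(
  cache: TtlCache<V>,
  key: string,
  ttlSeconds: number,
  compute: () => V,
): { value: V; cached: boolean } {
  const hit = cache.get(key);
  if (hit !== undefined) return { value: hit, cached: true };
  const value = compute();
  cache.set(key, value, ttlSeconds);
  return { value, cached: false };
}
```

Because the key includes the query text, two different queries never share an entry, while repeated identical queries within the TTL skip the expensive call.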

Retry Logic

Automatic retries on failure:
operations:
  - name: flaky-api
    operation: http
    config:
      url: https://api.example.com/data
      method: GET
    retry:
      maxAttempts: 3
      backoff: exponential
      initialDelay: 1000  # 1 second
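
As a rough illustration of what these settings imply, the sketch below assumes that `exponential` means the delay doubles after each failed attempt and that `initialDelay` is in milliseconds. Conductor's actual multiplier, jitter, and cap are not stated here. `backoffDelays` and `withRetry` are hypothetical names.

```typescript
interface RetryOptions {
  maxAttempts: number;
  initialDelay: number; // milliseconds
}

// Delays between attempts: maxAttempts tries means maxAttempts - 1 waits.
function backoffDelays(options: RetryOptions): number[] {
  const delays: number[] = [];
  for (let attempt = 1; attempt < options.maxAttempts; attempt++) {
    delays.push(options.initialDelay * 2 ** (attempt - 1));
  }
  return delays;
}

// Run the task up to maxAttempts times, sleeping between failures.
// `sleep` is injected so tests do not have to wait in real time.
async function withRetry<T>(
  task: () => Promise<T>,
  options: RetryOptions,
  sleep: (ms: number) => Promise<void>,
): Promise<T> {
  const delays = backoffDelays(options);
  let lastError: unknown;
  for (let attempt = 0; attempt < options.maxAttempts; attempt++) {
    try {
      return await task();
    } catch (error) {
      lastError = error;
      const delay = delays[attempt];
      if (delay !== undefined) await sleep(delay); // no wait after the final attempt
    }
  }
  throw lastError;
}
```

Under these assumptions, `maxAttempts: 3` with `initialDelay: 1000` waits 1 second after the first failure and 2 seconds after the second, then gives up if the third attempt also fails.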

Error Handling

Graceful failure handling:
operations:
  - name: try-primary
    operation: http
    config:
      url: https://primary-api.com/data
      method: GET

  - name: fallback
    operation: http
    condition: ${try-primary.failed}
    config:
      url: https://backup-api.com/data
      method: GET

outputs:
  data: ${try-primary.output.body || fallback.output.body}
  source: ${try-primary.failed ? 'fallback' : 'primary'}
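
The same primary-then-fallback flow, written as plain TypeScript, makes the control flow explicit: the fallback runs only when the primary fails, and the reported source follows from which call produced the data. This is a sketch with hypothetical names (`OpResult`, `runOp`, `withFallback`), kept synchronous for brevity.

```typescript
type OpResult<T> =
  | { failed: false; output: T }
  | { failed: true; error: string };

// Run an operation and capture failure as a value instead of an exception.
function runOp<T>(op: () => T): OpResult<T> {
  try {
    return { failed: false, output: op() };
  } catch (error) {
    return { failed: true, error: error instanceof Error ? error.message : String(error) };
  }
}

interface FallbackResult<T> {
  data: T | null;
  source: 'primary' | 'fallback' | 'none';
}

function withFallback<T>(primary: () => T, fallback: () => T): FallbackResult<T> {
  const first = runOp(primary);
  if (!first.failed) return { data: first.output, source: 'primary' };
  const second = runOp(fallback); // only runs when the primary failed
  if (!second.failed) return { data: second.output, source: 'fallback' };
  return { data: null, source: 'none' }; // both failed
}
```

The `'none'` case is worth handling explicitly: a fallback can fail too, and downstream consumers should be able to tell that apart from an empty result.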

State Management

Share state between operations:
agent: stateful-processor

state:
  schema:
    processed_count: number
    last_processed: string

operations:
  - name: process
    operation: code
    config:
      code: |
        const count = ${state.processed_count || 0};
        return {
          new_count: count + 1,
          timestamp: new Date().toISOString()
        };
    state:
      use: [processed_count]
      set:
        processed_count: ${process.output.new_count}
        last_processed: ${process.output.timestamp}

outputs:
  count: ${process.output.new_count}
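
One way to reason about the `use`/`set` pair is as a pure function from the current state to an output plus the next state. The sketch below models that idea only; it is not how Conductor stores state, and `ProcessorState` and `processOnce` are hypothetical names.

```typescript
interface ProcessorState {
  processed_count?: number;
  last_processed?: string;
}

interface ProcessOutput {
  new_count: number;
  timestamp: string;
}

// Read the fields listed under `use`, compute the output, and return the
// values that `set` would write back.
function processOnce(
  state: ProcessorState,
  now: Date,
): { output: ProcessOutput; state: ProcessorState } {
  const count = state.processed_count ?? 0; // first run starts from zero
  const output: ProcessOutput = {
    new_count: count + 1,
    timestamp: now.toISOString(),
  };
  return {
    output,
    state: { processed_count: output.new_count, last_processed: output.timestamp },
  };
}
```

Feeding each returned state into the next call increments the counter by one per run, which is the behavior the agent above declares.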

Versioning Agents

Once your agent works, version it with Edgit:
# Register agent
edgit components add agent company-enricher agents/company-enricher/ --type=agent

# Create version
edgit tag create company-enricher v1.0.0 --type=agent

# Deploy to production
edgit deploy set company-enricher v1.0.0 --to prod --type=agent
Use specific versions in ensembles:
ensemble: enrich-company

agents:
  - name: enricher
    agent: company-enricher@v1.0.0  # Locked to v1.0.0
    inputs:
      company_name: ${input.company}
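
The `name@version` reference is easy to split when you need to inspect or validate pins in your own tooling, for example in a lint script. This is a hypothetical helper, not part of Edgit or Conductor, and how an unpinned reference is resolved is not covered here.

```typescript
interface AgentRef {
  name: string;
  version: string | null; // null when the reference is not pinned
}

// Split "company-enricher@v1.0.0" into its name and version parts.
function parseAgentRef(ref: string): AgentRef {
  const at = ref.lastIndexOf('@');
  if (at <= 0) return { name: ref, version: null }; // no '@', or nothing before it
  return { name: ref.slice(0, at), version: ref.slice(at + 1) };
}

const pinned = parseAgentRef('company-enricher@v1.0.0');
const unpinned = parseAgentRef('company-enricher');
```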

Testing Agents

Test agents with Vitest:
// agents/company-enricher/agent.test.ts
import { describe, it, expect } from 'vitest';
import { TestConductor } from '@ensemble-edge/conductor/testing';

describe('company-enricher agent', () => {
  it('should enrich company data', async () => {
    const conductor = await TestConductor.create();
    await conductor.loadProject('./');

    const result = await conductor.executeAgent('company-enricher', {
      company_name: 'Anthropic',
      include_news: false
    });

    expect(result).toBeSuccessful();
    expect(result.output.company_data).toHaveProperty('name');
    expect(result.output.company_data).toHaveProperty('description');
    expect(result.output.company_data).toHaveProperty('industry');
  });

  it('should include news when requested', async () => {
    const conductor = await TestConductor.create();
    await conductor.loadProject('./');

    const result = await conductor.executeAgent('company-enricher', {
      company_name: 'Anthropic',
      include_news: true
    });

    expect(result).toBeSuccessful();
    expect(result.output.news).toBeDefined();
  });

  it('should cache search results', async () => {
    const conductor = await TestConductor.create();
    await conductor.loadProject('./');

    // First call
    const result1 = await conductor.executeAgent('company-enricher', {
      company_name: 'Anthropic'
    });

    // Second call (should use cache)
    const result2 = await conductor.executeAgent('company-enricher', {
      company_name: 'Anthropic'
    });

    expect(result1.operations.search.cached).toBe(false);
    expect(result2.operations.search.cached).toBe(true);
  });
});
Run tests:
npm test

Agent Composition

Agents can use other agents:
agent: full-company-profile

inputs:
  company_name:
    type: string
    required: true

operations:
  # Use company-enricher agent
  - name: enrich
    agent: company-enricher
    inputs:
      company_name: ${input.company_name}
      include_news: true

  # Use another agent for social data
  - name: social
    agent: social-scraper
    inputs:
      company_name: ${input.company_name}

  # Merge results
  - name: merge
    operation: code
    config:
      code: |
        return {
          ...${enrich.output.company_data},
          social: ${social.output},
          news: ${enrich.output.news}
        };

outputs:
  profile: ${merge.output}
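
Since agents behave like functions, composition can be pictured as one function calling two others and merging their results. The sketch below mirrors the merge step above in plain TypeScript; the types, the `buildProfile` name, and the shape of the social-scraper output are assumptions made for illustration.

```typescript
interface CompanyData {
  name: string;
  description: string;
  industry: string;
}

interface EnrichOutput {
  company_data: CompanyData;
  news: string[];
}

// Hypothetical output shape for the social-scraper agent.
interface SocialOutput {
  handles: string[];
}

type Agent<I, O> = (inputs: I) => O;

interface Profile extends CompanyData {
  social: SocialOutput;
  news: string[];
}

// Call both agents with the same company name, then merge their outputs.
function buildProfile(
  companyName: string,
  enrich: Agent<{ company_name: string; include_news: boolean }, EnrichOutput>,
  social: Agent<{ company_name: string }, SocialOutput>,
): Profile {
  const enriched = enrich({ company_name: companyName, include_news: true });
  const socialData = social({ company_name: companyName });
  // Company fields are spread to the top level; social and news are nested keys.
  return { ...enriched.company_data, social: socialData, news: enriched.news };
}
```

Because the two agents are passed in as parameters, each can be replaced with a stub in tests, which is the same benefit composition gives you at the agent level.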

Best Practices

  1. Single Responsibility - Each agent should do one thing well
  2. Clear Inputs/Outputs - Document what goes in and what comes out
  3. Cache Aggressively - Cache expensive operations (AI, HTTP)
  4. Handle Failures - Use conditions and fallbacks
  5. Version Everything - Use Edgit to version your agents
  6. Test Thoroughly - Write tests for all code paths
  7. Keep It Declarative - Let Conductor handle orchestration
  8. Monitor Performance - Track execution times and costs

Next Steps