Workflow and Capabilities

Break complex work into small, auditable, and repeatable steps. Agents gather data, use tools, validate intermediate results, and produce a traceable final output.


01 Flow Types

A pipeline is not limited to a single straight line. Sequential, parallel, conditional, and iterative patterns can be combined based on the dependencies of the work.

Sequential Flow

Each step uses the output of the previous one. Used for dependent processes such as data preparation, analysis, and report generation.

Fetch data → Clean → Analyze → Report

The safest default pattern; easy to debug.
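
The sequential pattern can be sketched in Python as a chain of functions, each taking the previous step's output. This is a minimal illustration, not a real pipeline API: the step functions (`fetch_data`, `clean`, `analyze`, `report`), the `run_sequential` helper, and the sample rows are all hypothetical.

```python
from functools import reduce

# Hypothetical step functions; each takes the previous step's output.
def fetch_data(_):
    return [{"region": "north", "sales": 120}, {"region": "south", "sales": None}]

def clean(rows):
    # Drop rows with missing sales values.
    return [r for r in rows if r["sales"] is not None]

def analyze(rows):
    return {"total_sales": sum(r["sales"] for r in rows), "row_count": len(rows)}

def report(stats):
    return f"{stats['row_count']} rows, total sales {stats['total_sales']}"

def run_sequential(steps, initial=None):
    # Feed each step's output into the next step, in order.
    return reduce(lambda value, step: step(value), steps, initial)

result = run_sequential([fetch_data, clean, analyze, report])
print(result)
```

Because every step has exactly one input and one output, a failure can be traced to a single function and its argument.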

Parallel Fan-Out

Independent subtasks are distributed to different agents at the same time. Useful when large work can be split so that no subtask has to wait for another.

SQL query | Log scan | Document reading → Merge results

Total duration approaches the duration of the slowest subtask.
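
A minimal fan-out sketch using Python's standard `concurrent.futures` module. It assumes the subtasks are independent callables; `sql_query`, `log_scan`, `document_reading`, `fan_out`, and `merge_results` are hypothetical names, and the returned findings are placeholders.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical independent subtasks; real ones would call tools or agents.
def sql_query():
    return {"source": "sql", "findings": ["finding from SQL"]}

def log_scan():
    return {"source": "logs", "findings": ["finding from logs"]}

def document_reading():
    return {"source": "docs", "findings": ["finding from documents"]}

def fan_out(tasks):
    # Submit every subtask at once, then collect results in submission order.
    with ThreadPoolExecutor(max_workers=len(tasks)) as pool:
        futures = [pool.submit(task) for task in tasks]
        return [future.result() for future in futures]

def merge_results(results):
    # Single merge point: flatten findings and keep their source.
    return [(r["source"], finding) for r in results for finding in r["findings"]]

merged = merge_results(fan_out([sql_query, log_scan, document_reading]))
print(merged)
```

Collecting results in submission order keeps the merged output deterministic even when subtasks finish in a different order.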

Conditional Branching

Different routes are selected based on a step's output. Risk, score, data quality, or user approval can become the decision point.

Evaluate → Check threshold → Select route → Apply

Because the rules are explicit, it creates an auditable decision flow.
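
One way to keep the decision rules explicit is a small threshold table that is checked in order. The sketch below assumes a risk score between 0 and 1; the thresholds and route names are hypothetical and chosen only for illustration.

```python
# Hypothetical thresholds and route names, for illustration only.
ROUTES = [
    (0.8, "request_human_approval"),  # high risk
    (0.4, "run_extra_validation"),    # medium risk
    (0.0, "apply_automatically"),     # low risk
]

def select_route(risk_score):
    # Return the first route whose threshold the score meets or exceeds.
    if not 0.0 <= risk_score <= 1.0:
        raise ValueError(f"risk_score out of range: {risk_score}")
    for threshold, route in ROUTES:
        if risk_score >= threshold:
            return route

print(select_route(0.5))
```

Because the table is plain data, it can be logged with each run so that a reviewer can see which rule produced which route.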

Iterative Review

Improvement steps run until the output passes a quality threshold. Effective for reports, code review, and classification work.

Generate → Check → Fix → Approve

A retry limit is required to prevent infinite loops.
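
The loop and its limit can be sketched as follows. This is an assumption-laden illustration: `review_loop`, its `max_rounds` parameter, the `needs_human_review` status, and the stand-in `generate`, `check`, and `fix` functions are all hypothetical.

```python
def review_loop(generate, check, fix, max_rounds=3):
    # Generate once, then check and fix until approved or the limit is hit.
    draft = generate()
    for round_number in range(1, max_rounds + 1):
        passed, feedback = check(draft)
        if passed:
            return {"status": "approved", "output": draft, "rounds": round_number}
        draft = fix(draft, feedback)
    # The limit guarantees termination even if the draft never passes.
    return {"status": "needs_human_review", "output": draft, "rounds": max_rounds}

# Hypothetical stand-ins: the check requires a "Sources:" section.
def generate():
    return "Sales dropped in week 3."

def check(draft):
    ok = "Sources:" in draft
    return ok, (None if ok else "missing sources")

def fix(draft, feedback):
    return draft + " Sources: sales_summary."

outcome = review_loop(generate, check, fix)
print(outcome["status"], outcome["rounds"])
```

When the limit is reached, the sketch hands the last draft to a human instead of silently accepting it.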

02 Step Anatomy

A production-ready step does more than run a command. It explicitly declares what it accepts, what it produces, which tools it may use and with what permissions, and what to do when something fails.

Purpose

The step's one-sentence goal and success criteria.

Input

Data from the previous step, user context, or an external source.

Agent

The model, tool permissions, and runtime mode required by the task.

Output

The validated structure that downstream steps will use.

Error Rule

Retry, wait, alternate route, or human approval.
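
The five parts above map naturally onto a small record type. The following is a sketch only: the `Step` class and its field names are hypothetical, while the example values reuse names that appear in this page's sample workflow.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass(frozen=True)
class Step:
    name: str
    purpose: str                    # one-sentence goal and success criteria
    agent: str                      # model or agent that runs the step
    output: str                     # named output for downstream steps
    input: Optional[str] = None     # named output of an earlier step, if any
    tools: Tuple[str, ...] = ()     # tools the step is allowed to call
    on_error: str = "stop"          # retry, wait, alternate route, or human approval

    def may_use(self, tool: str) -> bool:
        # A tool is permitted only if it is listed explicitly.
        return tool in self.tools

fetch = Step(
    name="fetch_sales_data",
    purpose="Load the sales summary for the analysis period.",
    agent="birk-agent-light",
    output="sales_summary",
    tools=("run_sql",),
    on_error="retry 2 times",
)
print(fetch.may_use("run_sql"), fetch.may_use("send_notification"))
```

Making the record immutable (`frozen=True`) means a step's permissions cannot be widened while a run is in progress.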

workflow.yaml

workflow: sales-drop-analysis
run_mode: auditable
timeout: 8 minutes

steps:
  - name: fetch_sales_data
    agent: birk-agent-light
    tools: [run_sql]
    output: sales_summary
    on_error: retry 2 times

  - name: investigate_causes
    agent: birk-agent-heavy
    input: sales_summary
    output: evidence_based_findings

  - name: write_report
    agent: birk-fast
    input: evidence_based_findings
    output: executive_summary
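
Before running a workflow like this, it can help to confirm that every `input` names an `output` produced by an earlier step. The sketch below assumes the YAML has already been parsed into a Python dict (the loader is not shown), and `find_wiring_errors` is a hypothetical helper, not part of any documented API.

```python
# The workflow above, written as the dict a YAML loader would plausibly return.
workflow = {
    "workflow": "sales-drop-analysis",
    "steps": [
        {"name": "fetch_sales_data", "agent": "birk-agent-light",
         "tools": ["run_sql"], "output": "sales_summary"},
        {"name": "investigate_causes", "agent": "birk-agent-heavy",
         "input": "sales_summary", "output": "evidence_based_findings"},
        {"name": "write_report", "agent": "birk-fast",
         "input": "evidence_based_findings", "output": "executive_summary"},
    ],
}

def find_wiring_errors(workflow):
    # Every input must name an output produced by an earlier step.
    errors, produced = [], set()
    for step in workflow["steps"]:
        needed = step.get("input")
        if needed is not None and needed not in produced:
            errors.append(f"{step['name']}: input '{needed}' is not produced by an earlier step")
        if step["output"] in produced:
            errors.append(f"{step['name']}: output '{step['output']}' is already defined")
        produced.add(step["output"])
    return errors

print(find_wiring_errors(workflow))
```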

03 Data Flow and State

Instead of passing free-form text between steps, pipelines use named outputs. This keeps each step readable and rerunnable, and ties every decision to the data behind it.

Raw Data

sales records

campaign calendar

customer segments

Intermediate State

clean table

anomaly list

evidence links

Final Output

summary

causes

recommended actions
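
Named outputs can be modeled as a small store that also remembers which step wrote each value. This is a minimal sketch; the `RunState` class and its methods are hypothetical, and the sample values are placeholders.

```python
class RunState:
    """Holds named outputs and remembers which step produced each one."""

    def __init__(self):
        self._values = {}
        self._producers = {}

    def put(self, name, value, produced_by):
        # Refuse silent overwrites so every name has exactly one producer.
        if name in self._values:
            raise KeyError(f"output '{name}' already written by {self._producers[name]}")
        self._values[name] = value
        self._producers[name] = produced_by

    def get(self, name):
        return self._values[name]

    def producer_of(self, name):
        return self._producers[name]

state = RunState()
state.put("clean_table", [{"region": "north", "sales": 120}], produced_by="clean")
state.put("anomaly_list", [], produced_by="analyze")
print(state.producer_of("anomaly_list"))
```

Rejecting a second write to the same name is a design choice in this sketch: it keeps the link between a value and its producing step unambiguous.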

04 Error Handling and Reliability

Real workflows can face network errors, denied tool permissions, missing data, or weak model output. The pipeline makes these cases visible and manageable.

Retry

Transient failures are retried a limited number of times with increasing delay.

Alternate Route

If a tool is unavailable, a fallback source, narrower task, or human approval can take over.

Quality Gate

Empty, unsupported, or schema-invalid outputs are stopped before reaching the next step.
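
A quality gate can be as small as a function that returns a list of problems. The sketch below covers only the empty and schema-invalid cases; `quality_gate` and the field-to-type schema format are hypothetical.

```python
def quality_gate(output, required_fields):
    # Return a list of problems; an empty list means the output may proceed.
    if not output:
        return ["output is empty"]
    problems = []
    for field_name, expected_type in required_fields.items():
        if field_name not in output:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(output[field_name], expected_type):
            problems.append(f"wrong type for {field_name}")
    return problems

schema = {"summary": str, "causes": list}
print(quality_gate({"summary": "ok", "causes": []}, schema))
print(quality_gate({"summary": 5}, schema))
```

Returning every problem at once, instead of stopping at the first, gives a fix step enough feedback to repair the output in one pass.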

error rules

error_rules:
  timeout: stop_step
  transient_network_error: retry 3 times
  permission_denied: request_human_approval
  schema_mismatch: fix_output_and_validate_again
  critical_action: hold_in_safe_mode
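
The `transient_network_error: retry 3 times` rule, combined with increasing delay, might look like the sketch below. The exception class, the `run_with_retry` helper, the 0.5 second base delay, and the doubling schedule are assumptions made for illustration.

```python
import time

class TransientNetworkError(Exception):
    """Hypothetical error type for failures that are worth retrying."""

def run_with_retry(action, max_retries=3, base_delay=0.5, sleep=time.sleep):
    # Try once, then retry up to max_retries times, doubling the delay each time.
    for attempt in range(max_retries + 1):
        try:
            return action()
        except TransientNetworkError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)

# Usage: an action that fails twice, then succeeds. Delays are recorded, not slept.
calls = {"count": 0}
delays = []

def flaky_action():
    calls["count"] += 1
    if calls["count"] < 3:
        raise TransientNetworkError("temporary failure")
    return "ok"

result = run_with_retry(flaky_action, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the retry logic testable without real waiting. Only the transient error type is retried; any other exception propagates immediately.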

05 Capability Catalog

Capabilities are reusable skill packages that agents can use. A capability is not just a list of tools; it also defines permission scope, expected output, and security boundaries.

data_analyst

Writes safe queries, reads table schemas, and extracts numerical findings.

run_sql, describe_table, summarize_results

log_inspector

Scans application logs and flags outage or anomaly patterns.

search_logs, select_time_range, find_error_clusters

document_reader

Reads documents and knowledge-base chunks, then produces evidence-backed answers.

read_file, vector_search, cite_sources

action_executor

Sends approved actions to external systems and records the result.

open_task, send_notification, update_status
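
Treating each capability as a permission scope can be sketched with a deny-by-default lookup. The capability and tool names are read from the catalog above; the `CATALOG` structure and the `authorize` helper are hypothetical.

```python
# Tool names come from the catalog above; the structure itself is an assumption.
CATALOG = {
    "data_analyst": frozenset({"run_sql", "describe_table", "summarize_results"}),
    "log_inspector": frozenset({"search_logs", "select_time_range", "find_error_clusters"}),
    "document_reader": frozenset({"read_file", "vector_search", "cite_sources"}),
    "action_executor": frozenset({"open_task", "send_notification", "update_status"}),
}

def authorize(capability, tool):
    # Deny by default: unknown capabilities and unlisted tools are both rejected.
    allowed = CATALOG.get(capability, frozenset())
    if tool not in allowed:
        raise PermissionError(f"{capability} may not call {tool}")
    return True

print(authorize("data_analyst", "run_sql"))
```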

06 Observability

Every run leaves a trace. It should be possible to answer which step took how long, which tool was called, which source was used, and which decision was produced.

Run trace

00:00  Request received: user intent and context recorded
00:02  Data collected: 3 sources read, 1 table queried
00:11  Analysis completed: 4 findings and 2 anomalies flagged
00:15  Report generated: evidence and actions added
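
A trace like this can be produced by a small recorder that timestamps each event relative to the start of the run. The `RunTrace` class below is a hypothetical sketch; the clock is injectable so the example is deterministic.

```python
import time

class RunTrace:
    """Records events with the time elapsed since the run started."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._start = clock()
        self.events = []

    def record(self, title, detail):
        elapsed = self._clock() - self._start
        self.events.append({"elapsed_sec": round(elapsed, 1), "title": title, "detail": detail})

    def render(self):
        # Format each event as MM:SS followed by its title and detail.
        lines = []
        for event in self.events:
            minutes, seconds = divmod(int(event["elapsed_sec"]), 60)
            lines.append(f"{minutes:02d}:{seconds:02d}  {event['title']}: {event['detail']}")
        return "\n".join(lines)

# Usage with a fake clock: run start, then one reading per recorded event.
ticks = iter([0.0, 0.0, 2.0])
trace = RunTrace(clock=lambda: next(ticks))
trace.record("Request received", "user intent and context recorded")
trace.record("Data collected", "3 sources read, 1 table queried")
print(trace.render())
```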

Recorded metrics

Duration: 15.4 sec
Tool calls
Retries
Quality score: 94/100
Human approval: not required
Cost: per step

07 Production Checklist

CHECK 01

Each step's input and output schema must be clearly defined.

CHECK 02

Tool-using steps must keep permissions at the narrowest possible scope.

CHECK 03

Retries must be limited, delayed, and observable.

CHECK 04

Outputs from parallel steps must pass through a single validation step.

CHECK 05

Critical actions must include human approval or a safe operating mode.

CHECK 06

The trace, cost, duration, and tools used must be recorded for every run.

Security

Permissions are limited at the step level.

Audit

Sources and the runtime trace are stored for every decision.

Rerun

A failed step can be retried on its own.

08 Next Steps

Orchestration Layer

See how pipeline steps are distributed to models and agents.


API Reference

Start pipeline runs through the API, monitor them, and read their results.
