Workflow and Capabilities

Break complex work into small, auditable, and repeatable steps. Agents gather data, use tools, validate intermediate results, and produce a traceable final output.

01 Flow Types

A pipeline is not limited to a single straight line. Sequential, parallel, conditional, and iterative patterns can be combined based on the dependencies of the work.

Sequential Flow

Each step uses the output of the previous one. Use this pattern for dependent processes such as data preparation, analysis, and report generation.

Fetch data → Clean → Analyze → Report
The safest default pattern; easy to debug.
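
The chain above can be sketched in a few lines of Python. This is a minimal illustration, not the product's runtime: `run_sequential` and the four stand-in step functions are hypothetical names, and the sample numbers are invented for the example.

```python
from typing import Any, Callable, List

Step = Callable[[Any], Any]


def run_sequential(steps: List[Step], initial: Any) -> Any:
    """Feed each step's output into the next step, in order."""
    result = initial
    for step in steps:
        result = step(result)
    return result


# Hypothetical stand-ins for Fetch data, Clean, Analyze, and Report.
def fetch_data(_: Any) -> list:
    return [120.0, None, 80.0, 100.0]


def clean(rows: list) -> list:
    # Drop missing values before analysis.
    return [row for row in rows if row is not None]


def analyze(rows: list) -> dict:
    return {"count": len(rows), "mean": sum(rows) / len(rows)}


def report(stats: dict) -> str:
    return f"{stats['count']} rows, mean {stats['mean']:.1f}"


summary = run_sequential([fetch_data, clean, analyze, report], None)
print(summary)
```

Because every step sees only the previous step's result, a failure can be reproduced by calling that one function with the recorded input.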

Parallel Fan-Out

Independent subtasks are distributed to different agents at the same time. This is useful for splitting large work so that no subtask waits on another.

SQL query, Log scan, Document reading (in parallel) → Merge results
Total duration approaches the duration of the slowest subtask.
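
One way to sketch fan-out and merge in Python is with the standard library's `ThreadPoolExecutor`. The function names `fan_out` and `merge_results`, the three stand-in subtasks, and their return values are assumptions made for this example; real subtasks would call agents or tools.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict


def fan_out(tasks: Dict[str, Callable[[], Any]]) -> Dict[str, Any]:
    """Run independent tasks at the same time and collect results by name."""
    with ThreadPoolExecutor(max_workers=max(1, len(tasks))) as pool:
        futures = {name: pool.submit(task) for name, task in tasks.items()}
        # result() blocks until the task finishes and re-raises its exception.
        return {name: future.result() for name, future in futures.items()}


# Hypothetical subtasks with made-up results.
def sql_query() -> dict:
    return {"orders_last_week": 412}


def log_scan() -> dict:
    return {"error_clusters": 2}


def document_reading() -> dict:
    return {"relevant_documents": 5}


def merge_results(results: Dict[str, dict]) -> dict:
    """Flatten per-task results into one dict, prefixing keys with the task name."""
    merged = {}
    for task_name, payload in results.items():
        for key, value in payload.items():
            merged[f"{task_name}.{key}"] = value
    return merged


merged = merge_results(
    fan_out(
        {
            "sql_query": sql_query,
            "log_scan": log_scan,
            "document_reading": document_reading,
        }
    )
)
print(merged)
```

Prefixing keys with the task name keeps the merged result traceable to the subtask that produced each value.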

Conditional Branching

Different routes are selected based on a step's output. Risk, score, data quality, or user approval can become the decision point.

Evaluate → Check threshold → Select route → Apply
Because the rules are explicit, it creates an auditable decision flow.
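
A threshold-based router might look like the sketch below. The rule names, threshold values, and route names are all hypothetical; the point is that the rules live in data and every decision comes back with a reason that can be logged.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    min_score: float
    route: str


# Hypothetical thresholds, checked from highest to lowest.
RULES = [
    Rule("high_risk", 0.8, "human_review"),
    Rule("medium_risk", 0.5, "extra_validation"),
]
DEFAULT_ROUTE = "auto_apply"


def select_route(risk_score: float) -> Tuple[str, str]:
    """Return (route, reason) so the decision can be written to the audit trail."""
    for rule in RULES:
        if risk_score >= rule.min_score:
            reason = f"{rule.name}: score {risk_score} >= {rule.min_score}"
            return rule.route, reason
    return DEFAULT_ROUTE, f"default: score {risk_score} is below all thresholds"


route, reason = select_route(0.6)
print(route, "|", reason)
```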

Iterative Review

Improvement steps run until the output passes a quality threshold. This pattern is effective for reports, code review, and classification work.

Generate → Check → Fix → Approve
A retry limit is required to prevent infinite loops.
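
The loop and its retry limit can be sketched as follows. `iterate_until_approved`, the scoring rule, and the sample issues are assumptions for illustration; a real check would typically be a model-based or rule-based reviewer.

```python
from typing import Any, Callable


def iterate_until_approved(
    generate: Callable[[], Any],
    check: Callable[[Any], float],
    fix: Callable[[Any], Any],
    threshold: float,
    max_rounds: int = 3,
) -> dict:
    """Generate once, then check and fix until the score passes or rounds run out."""
    draft = generate()
    for attempt in range(1, max_rounds + 1):
        if check(draft) >= threshold:
            return {"approved": True, "output": draft, "attempts": attempt}
        if attempt < max_rounds:
            draft = fix(draft)
    # The limit was reached: stop instead of looping forever.
    return {"approved": False, "output": draft, "attempts": max_rounds}


# Hypothetical example: a draft is modeled as its list of open issues.
def generate() -> list:
    return ["typo", "missing source"]


def check(issues: list) -> float:
    return 100 - 20 * len(issues)


def fix(issues: list) -> list:
    return issues[1:]  # resolve one issue per round


result = iterate_until_approved(generate, check, fix, threshold=90)
print(result)
```

When the limit is hit, the function returns the last draft with `approved` set to `False`, which leaves the decision about escalation to the caller.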

02 Step Anatomy

A production-ready step does more than run a command. It explicitly declares what it accepts, what it produces, which tools it may use and with what permissions, and what to do when something fails.

01 Purpose: The step's one-sentence goal and success criteria.

02 Input: Data from the previous step, user context, or an external source.

03 Agent: The model, tool permission, and runtime mode required by the task.

04 Output: The validated structure that downstream steps will use.

05 Error Rule: Retry, wait, alternate route, or human approval.

workflow.yaml
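# Workflow-level settings.
workflow: sales-drop-analysis
run_mode: auditable
timeout: 8 minutes

steps:
  # Query the sales data; run_sql is the only tool this step may use.
  - name: fetch_sales_data
    agent: birk-agent-light
    tools: [run_sql]
    output: sales_summary
    on_error: retry 2 times

  # Consume sales_summary and produce findings backed by evidence.
  - name: investigate_causes
    agent: birk-agent-heavy
    input: sales_summary
    output: evidence_based_findings

  # Turn the findings into the final executive summary.
  - name: write_report
    agent: birk-fast
    input: evidence_based_findings
    output: executive_summary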
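
To make the anatomy concrete, the sketch below models a step as a Python dataclass and checks that every input is produced by an earlier step. `StepSpec`, `check_wiring`, the `purpose` texts, and the default `on_error` value are hypothetical; only the step, agent, tool, and output names come from the workflow file above.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class StepSpec:
    name: str
    purpose: str                   # one-sentence goal (hypothetical field)
    agent: str
    output: str
    input: Optional[str] = None
    tools: Tuple[str, ...] = ()
    on_error: str = "stop"         # assumed default, not taken from the source


def check_wiring(steps: List[StepSpec]) -> List[str]:
    """Return a list of problems; an empty list means the steps connect cleanly."""
    problems = []
    produced = set()
    for step in steps:
        if step.input is not None and step.input not in produced:
            problems.append(
                f"{step.name}: input '{step.input}' is not produced by an earlier step"
            )
        if step.output in produced:
            problems.append(f"{step.name}: output '{step.output}' is already produced")
        produced.add(step.output)
    return problems


STEPS = [
    StepSpec("fetch_sales_data", "Load the sales figures.", "birk-agent-light",
             "sales_summary", tools=("run_sql",), on_error="retry 2 times"),
    StepSpec("investigate_causes", "Explain the drop with evidence.",
             "birk-agent-heavy", "evidence_based_findings", input="sales_summary"),
    StepSpec("write_report", "Summarize the findings for executives.", "birk-fast",
             "executive_summary", input="evidence_based_findings"),
]

problems = check_wiring(STEPS)
print(problems)
```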

03 Data Flow and State

Instead of passing free-form text between steps, pipelines use named outputs. Named outputs make each step readable and rerunnable, and they tie every decision to the data behind it.

Raw Data

sales records
campaign calendar
customer segments

Intermediate State

clean table
anomaly list
evidence links

Final Output

summary
causes
recommended actions
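
A named-output store can be sketched as a small class that records which step wrote each value. `RunState` and its methods are hypothetical names, and the stored values are placeholders.

```python
from typing import Any, Dict


class RunState:
    """Named outputs for one run, each tagged with the step that produced it."""

    def __init__(self) -> None:
        self._values: Dict[str, Any] = {}
        self._producers: Dict[str, str] = {}

    def put(self, name: str, value: Any, produced_by: str) -> None:
        # Refuse silent overwrites so every name maps to exactly one producer.
        if name in self._values:
            raise ValueError(
                f"output '{name}' was already written by {self._producers[name]}"
            )
        self._values[name] = value
        self._producers[name] = produced_by

    def get(self, name: str) -> Any:
        if name not in self._values:
            raise KeyError(f"no step has produced '{name}' yet")
        return self._values[name]

    def producer_of(self, name: str) -> str:
        return self._producers[name]


state = RunState()
state.put("sales_summary", {"weeks": 4}, produced_by="fetch_sales_data")
state.put("anomaly_list", ["week 3"], produced_by="investigate_causes")
print(state.get("anomaly_list"), "from", state.producer_of("anomaly_list"))
```

Rerunning a single step then amounts to reading its named inputs from the store and writing its output into a fresh run state.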

04 Error Handling and Reliability

Real workflows can face network errors, denied tool permissions, missing data, or weak model output. The pipeline makes these cases visible and manageable.

Retry

Transient failures are retried a limited number of times with increasing delay.
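
A limited retry with increasing delay could be sketched like this. The doubling schedule, the `TransientError` class, and the function name are assumptions; the source states only that retries are limited and that the delay increases.

```python
import time
from typing import Any, Callable


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def retry_with_backoff(
    call: Callable[[], Any],
    max_retries: int = 3,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Any:
    """Call once, then retry up to max_retries times, doubling the delay each time."""
    attempt = 0
    while True:
        try:
            return call()
        except TransientError:
            if attempt >= max_retries:
                raise  # limit reached: surface the failure to the pipeline
            sleep(base_delay * 2 ** attempt)
            attempt += 1


# Example: fails twice, then succeeds. Delays are captured instead of slept.
attempts = {"count": 0}


def flaky_fetch() -> str:
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise TransientError("connection reset")
    return "sales_summary"


delays = []
result = retry_with_backoff(flaky_fetch, sleep=delays.append)
print(result, delays)
```

Passing `sleep` as a parameter keeps the delays observable and lets tests run without waiting.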

Alternate Route

If a tool is unavailable, a fallback source, narrower task, or human approval can take over.
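
A fallback chain is one possible shape for an alternate route. In this sketch, `first_available` and both data sources are hypothetical; when every source fails, the error message signals that the run should be escalated.

```python
from typing import Any, Callable, List, Tuple


def first_available(sources: List[Tuple[str, Callable[[], Any]]]) -> Tuple[str, Any]:
    """Try each source in order and return (source_name, data) from the first success."""
    failures = []
    for name, fetch in sources:
        try:
            return name, fetch()
        except Exception as error:
            failures.append(f"{name}: {error}")
    raise RuntimeError(
        "all sources failed, escalate for human approval: " + "; ".join(failures)
    )


# Hypothetical sources: the primary is down, the fallback is older but reachable.
def live_database() -> dict:
    raise ConnectionError("database unreachable")


def nightly_export() -> dict:
    return {"rows": 1200, "stale": True}


used, data = first_available(
    [("live_database", live_database), ("nightly_export", nightly_export)]
)
print(used, data)
```

Returning the source name alongside the data lets the trace show that a fallback was used.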

Quality Gate

Empty, unsupported, or schema-invalid outputs are stopped before reaching the next step.
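
A quality gate can be sketched as a function that returns a list of problems and lets the output through only when that list is empty. The schema format, the `findings` and `evidence` field names, and the sample output are assumptions for this example.

```python
from typing import Any, Dict, List


def quality_gate(output: Any, schema: Dict[str, type]) -> List[str]:
    """Return the reasons to block this output; an empty list means it may pass."""
    if not isinstance(output, dict) or not output:
        return ["output is empty or not an object"]
    problems = []
    # Schema check: every required field must exist with the expected type.
    for field_name, expected_type in schema.items():
        if field_name not in output:
            problems.append(f"missing field: {field_name}")
        elif not isinstance(output[field_name], expected_type):
            problems.append(f"wrong type for field: {field_name}")
    # Support check: every finding must carry at least one piece of evidence.
    findings = output.get("findings")
    if isinstance(findings, list):
        for index, finding in enumerate(findings):
            if not (isinstance(finding, dict) and finding.get("evidence")):
                problems.append(f"finding {index} has no evidence")
    return problems


SCHEMA = {"summary": str, "findings": list}

candidate = {
    "summary": "Sales fell after the campaign ended.",
    "findings": [
        {"claim": "Campaign ended in week 3", "evidence": ["campaign calendar"]},
        {"claim": "Competitor cut prices", "evidence": []},
    ],
}

print(quality_gate(candidate, SCHEMA))
```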

error rules
error_rules:
  timeout: stop_step
  transient_network_error: retry 3 times
  permission_denied: request_human_approval
  schema_mismatch: fix_output_and_validate_again
  critical_action: hold_in_safe_mode
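
The rules above can be read as a lookup from failure type to action. The sketch below maps Python's built-in exception types onto those rule names; the mapping itself, the `SchemaMismatch` class, and the choice to stop on exhausted retries or unknown errors are assumptions. The `critical_action` rule is left out because it describes an action type rather than a failure.

```python
class SchemaMismatch(Exception):
    """Hypothetical error raised when an output fails schema validation."""


MAX_RETRIES = 3  # mirrors "retry 3 times" in the rules above


def decide_action(error: Exception, retries_used: int = 0) -> str:
    """Map a failure to one of the rule actions."""
    if isinstance(error, TimeoutError):
        return "stop_step"
    if isinstance(error, ConnectionError):
        # Assumption: once retries are exhausted, the step stops.
        return "retry" if retries_used < MAX_RETRIES else "stop_step"
    if isinstance(error, PermissionError):
        return "request_human_approval"
    if isinstance(error, SchemaMismatch):
        return "fix_output_and_validate_again"
    # Assumption: unknown failures stop the step rather than continue silently.
    return "stop_step"


print(decide_action(ConnectionError("reset"), retries_used=1))
print(decide_action(PermissionError("run_sql denied")))
```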

05 Capability Catalog

Capabilities are reusable skill packages that agents can use. A capability is not just a list of tools; it also defines permission scope, expected output, and security boundaries.

data_analyst

Writes safe queries, reads table schemas, and extracts numerical findings.

Tools: run_sql, describe_table, summarize_results

log_inspector

Scans application logs and flags outage or anomaly patterns.

Tools: search_logs, select_time_range, find_error_clusters

document_reader

Reads documents and knowledge-base chunks, then produces evidence-backed answers.

Tools: read_file, vector_search, cite_sources

action_executor

Sends approved actions to external systems and records the result.

Tools: open_task, send_notification, update_status
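
The catalog can be sketched as data plus a permission check. The capability and tool names come from the list above; the `Capability` class, the `authorize` function, and the `needs_approval` flag on `action_executor` are assumptions for illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet


@dataclass(frozen=True)
class Capability:
    name: str
    tools: FrozenSet[str]
    needs_approval: bool = False  # hypothetical flag for approval-gated actions


CATALOG = {
    capability.name: capability
    for capability in [
        Capability("data_analyst",
                   frozenset({"run_sql", "describe_table", "summarize_results"})),
        Capability("log_inspector",
                   frozenset({"search_logs", "select_time_range", "find_error_clusters"})),
        Capability("document_reader",
                   frozenset({"read_file", "vector_search", "cite_sources"})),
        Capability("action_executor",
                   frozenset({"open_task", "send_notification", "update_status"}),
                   needs_approval=True),
    ]
}


def authorize(capability_name: str, tool: str, approved: bool = False) -> bool:
    """Allow a tool call only if it is inside the capability's declared scope."""
    capability = CATALOG.get(capability_name)
    if capability is None or tool not in capability.tools:
        return False
    return approved or not capability.needs_approval


print(authorize("data_analyst", "run_sql"))
print(authorize("data_analyst", "send_notification"))
print(authorize("action_executor", "open_task"))
```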

06 Observability

Every run leaves a trace. The trace should answer how long each step took, which tools were called, which sources were used, and which decisions were produced.

Run trace

00:00  Request received: user intent and context recorded
00:02  Data collected: 3 sources read, 1 table queried
00:11  Analysis completed: 4 findings and 2 anomalies flagged
00:15  Report generated: evidence and actions added

Recorded metrics

Duration: 15.4 sec
Tool calls: 6
Retries: 1
Quality score: 94/100
Human approval: not required
Cost: per step
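
A trace recorder can be sketched with a context manager that times each step and collects its tool calls. `RunTrace`, its record fields, and the metric names are hypothetical; the clock is injectable so the recorder can be tested without real waiting.

```python
import time
from contextlib import contextmanager
from typing import Callable, Dict, Iterator, List


class RunTrace:
    """Collects one record per step: name, status, duration, and tool calls."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock
        self.events: List[dict] = []

    @contextmanager
    def step(self, name: str) -> Iterator[dict]:
        record = {"step": name, "status": "ok", "tool_calls": []}
        started = self._clock()
        try:
            yield record
        except Exception:
            record["status"] = "failed"
            raise
        finally:
            # Recorded on success and on failure, so failed steps stay visible.
            record["duration_s"] = self._clock() - started
            self.events.append(record)

    def metrics(self) -> Dict[str, float]:
        return {
            "steps": len(self.events),
            "tool_calls": sum(len(event["tool_calls"]) for event in self.events),
            "duration_s": sum(event["duration_s"] for event in self.events),
        }


trace = RunTrace()
with trace.step("fetch_sales_data") as record:
    record["tool_calls"].append("run_sql")
with trace.step("write_report"):
    pass

print(trace.metrics())
```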

07 Production Checklist

CHECK 01

Each step's input and output schema must be clearly defined.

CHECK 02

Tool-using steps must keep permissions at the narrowest possible scope.

CHECK 03

Retries must be limited, delayed, and observable.

CHECK 04

Outputs from parallel steps must pass through a single validation step.

CHECK 05

Critical actions must include human approval or a safe operating mode.

CHECK 06

Trace, cost, duration, and tools used must be recorded for every run.

Security

Permissions are limited at the step level.

Audit

Sources and runtime trace are stored for every decision.

Rerun

A failed step can be retried on its own.