Workflow Run Observability

Every workflow run produces a rich stream of events, metrics, and diagnostics. The run detail page provides four views into execution — a real-time message timeline, context window analysis, a live network visualization, and post-run metrics.

Run Detail Tabs

Messages Timeline

The default view. Every event in the workflow renders chronologically as the run progresses — agent messages, tool executions, stage transitions, user interactions, and system events.

A filter bar across the top lets you toggle event categories:

| Filter | What it shows |
| --- | --- |
| Conversations | Agent-to-agent messages |
| User | User messages and interrupts |
| Workflow | Stage transitions, workflow lifecycle events |
| Tools | Tool executions with parameters and results |
| States | Agent state transitions and task assignments |

Filter preferences persist across page reloads.

Stage dividers are interactive. On completed, stopped, or errored runs, hovering over a divider reveals a Re-run from here button, which creates a new run that starts at that stage and inherits context from the original run. See Re-running from a Stage for details.

A sidebar alongside the timeline shows stage results (downloadable as markdown) and any documents produced during the run.

The info bar at the top displays the run status, workflow name, team, and a live token usage counter per agent — updated incrementally as the run progresses.

Context Window

A time-series chart showing how each agent's context window grows over the course of the run.

  • X-axis: Elapsed time since run start
  • Y-axis: Context size in tokens
  • One line per agent — solid lines show compacted context, dotted lines show what the context would be without compaction
  • Vertical markers at stage transitions show when the workflow moved between stages

This view makes compaction effectiveness visible at a glance. A large gap between the solid and dotted lines means compaction is saving significant tokens. A context line that grows without bound suggests the agent's compaction preset may need adjustment.
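To put a number on the gap between the solid and dotted lines, the following minimal sketch computes the share of tokens saved by compaction at the latest sample. The `ContextSample` type and its field names are hypothetical and chosen only for illustration; they are not ORQO's data model.

```python
from dataclasses import dataclass


@dataclass
class ContextSample:
    """One point on the chart (hypothetical shape, not ORQO's schema)."""
    elapsed_s: float         # x-axis: seconds since run start
    compacted_tokens: int    # solid line
    uncompacted_tokens: int  # dotted line


def compaction_savings(samples: list[ContextSample]) -> float:
    """Fraction of tokens saved at the most recent sample (0.0 if no data)."""
    if not samples:
        return 0.0
    last = max(samples, key=lambda s: s.elapsed_s)
    if last.uncompacted_tokens == 0:
        return 0.0
    return 1 - last.compacted_tokens / last.uncompacted_tokens


samples = [
    ContextSample(10, 4_000, 4_000),
    ContextSample(60, 9_000, 15_000),
    ContextSample(120, 12_000, 30_000),
]
savings = compaction_savings(samples)
print(f"{savings:.0%}")
```

A value near zero late in a long run would correspond to the "grows without bound" case described above, where the compaction preset may be worth revisiting.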

Live Network

A real-time graph visualization of the workflow structure. As the run progresses, nodes highlight and pulse to show activity:

  • Stages highlight when active, dim when complete
  • Agent assignments pulse when the agent takes a turn
  • Tools pulse when executed
  • Hooks pulse when they fire

Activity pills float above active nodes showing brief parameter or message snippets, then fade after a few seconds. The graph supports pan and zoom for complex workflows.

Metrics

Available after a run completes. Three levels of detail:

Run summary — Key metrics at a glance:

  • Duration, stage count, total LLM calls, input/output tokens, cached tokens, tool calls, tool failures, compaction tokens

Per-stage breakdown — For each stage:

  • Stage name, outcome, duration, cycle count
  • LLM calls, token usage, tool calls

Per-agent breakdown — Within each stage:

  • Iterations, messages sent, LLM calls, token split (input/output/cached)
  • Tool breakdown: which tools were called how many times, with failure counts
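The three levels nest naturally: per-agent counters sum into a stage record, and stage records sum into the run summary. The sketch below shows the first step of that roll-up under assumed field names (`AgentMetrics` and `roll_up` are hypothetical, not part of the platform).

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class AgentMetrics:
    """Per-agent counters within one stage (hypothetical field names)."""
    agent: str
    llm_calls: int = 0
    input_tokens: int = 0
    output_tokens: int = 0
    cached_tokens: int = 0
    tool_calls: Counter = field(default_factory=Counter)     # tool name -> calls
    tool_failures: Counter = field(default_factory=Counter)  # tool name -> failures


def roll_up(agents: list[AgentMetrics]) -> dict:
    """Sum per-agent counters into one stage-level record."""
    tool_calls, tool_failures = Counter(), Counter()
    for a in agents:
        tool_calls.update(a.tool_calls)
        tool_failures.update(a.tool_failures)
    return {
        "llm_calls": sum(a.llm_calls for a in agents),
        "input_tokens": sum(a.input_tokens for a in agents),
        "output_tokens": sum(a.output_tokens for a in agents),
        "cached_tokens": sum(a.cached_tokens for a in agents),
        "tool_calls": sum(tool_calls.values()),
        "tool_failures": sum(tool_failures.values()),
        "tools": dict(tool_calls),  # per-tool call counts
    }


stage = roll_up([
    AgentMetrics("researcher", llm_calls=3, input_tokens=1200, output_tokens=300,
                 tool_calls=Counter(search=2), tool_failures=Counter(search=1)),
    AgentMetrics("writer", llm_calls=2, input_tokens=800, output_tokens=500,
                 tool_calls=Counter(search=1, save_doc=1)),
])
print(stage)
```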

Real-Time Events — Tracing

Events stream to the browser over a WebSocket connection. Each event is persisted as it arrives, so the timeline builds up in real time during execution.

The complete event stream constitutes a trace of the workflow run — the full execution path of every agent turn, tool call, stage transition, and routing decision. This is the industry-standard concept of tracing applied to multi-agent orchestration: you can reconstruct exactly what happened, in what order, and why, for any run.

If you open a run that's already in progress (or revisit a completed run), any events you missed are automatically backfilled — the timeline is always complete.
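One common way to get a complete timeline from a backfill plus a live stream is to merge both by a per-run sequence number and drop duplicates. The sketch below assumes each event carries a monotonically increasing `seq` field; that field and the `merge_backfill` helper are illustrative assumptions, not the documented wire format.

```python
def merge_backfill(backfilled: list[dict], live: list[dict]) -> list[dict]:
    """Combine persisted and live events into one ordered, duplicate-free list.

    Assumes every event has an integer "seq" unique within the run.
    """
    by_seq: dict[int, dict] = {}
    for event in [*backfilled, *live]:
        by_seq[event["seq"]] = event  # same seq seen twice: keep one copy
    return [by_seq[seq] for seq in sorted(by_seq)]


# Events 1-3 were persisted before the page opened; 3-4 arrived over the socket.
backfilled = [{"seq": 1, "type": "workflow_started"},
              {"seq": 2, "type": "agent_message"},
              {"seq": 3, "type": "tool_execution"}]
live = [{"seq": 3, "type": "tool_execution"},
        {"seq": 4, "type": "agent_message"}]

timeline = merge_backfill(backfilled, live)
print([e["seq"] for e in timeline])
```

Keying on a sequence number rather than arrival time keeps the order stable even when the live stream and the backfill overlap.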

Event Types

Events fall into two categories:

Rendered events — Displayed in the messages timeline:

| Event | What it represents |
| --- | --- |
| Agent message | An agent sent a message |
| User message | The user responded to an agent's question |
| Tool execution | A tool was called with parameters and returned results |
| Stage transition | A stage started, completed, or changed state |
| Agent assigned | An agent was assigned to begin work |
| Workflow started / completed | Run lifecycle boundaries |
| workflow_paused / workflow_resumed | User paused or resumed execution |
| stage_result_stored | A stage produced its output |
| hook_completed / hook_error | An entry or exit hook fired |

Data events — Persisted but not rendered:

| Event | What it captures |
| --- | --- |
| llm_usage | Token counts per LLM call (feeds the context chart and token counters) |
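A client consuming the stream could route events by category with a simple lookup. The sketch below uses only the event identifiers that appear verbatim in the tables above; events listed by display name only (such as "Agent message") are left out because their identifiers are not given here. The `classify` helper is hypothetical.

```python
# Identifiers taken from the tables above; the sets are intentionally partial.
RENDERED = frozenset({
    "workflow_paused", "workflow_resumed",
    "stage_result_stored",
    "hook_completed", "hook_error",
})
DATA_ONLY = frozenset({"llm_usage"})


def classify(event_type: str) -> str:
    """Return "rendered", "data", or "unknown" for an event type."""
    if event_type in RENDERED:
        return "rendered"
    if event_type in DATA_ONLY:
        return "data"
    return "unknown"  # not covered by this partial sketch


for name in ("hook_error", "llm_usage", "something_else"):
    print(name, "->", classify(name))
```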

User Interaction During Runs

When an agent with Communicates with User enabled asks a question, an input form appears at the bottom of the messages timeline. The user types a response and it is delivered to the agent in real time.

Additional controls during execution:

| Action | What it does |
| --- | --- |
| Pause | Suspends the workflow after the current agent turn completes |
| Resume with instruction | Continues a paused workflow with an additional message injected |
| Stop | Terminates the workflow immediately |

Run Progression Analysis

When viewing the run history for a specific workflow, ORQO compares consecutive completed runs to surface performance trends.

For each pair of runs, the system computes deltas across all metrics and classifies them:

  • Improvements (green) — Metrics that moved in the right direction (lower cost, fewer tokens, faster completion)
  • Regressions (red) — Metrics that worsened

If the workflow configuration changed between runs (different version), the comparison includes a config diff showing what was modified — so you can correlate performance changes with specific edits.

The top improvements and regressions are ranked by magnitude. Visual weight scales with importance — the most significant changes are the most prominent.
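The comparison described above can be sketched as a relative delta per metric, classified by sign and ranked by absolute size. This sketch assumes every metric is lower-is-better (as in the examples of cost, tokens, and completion time) and that magnitude means relative change; the metric names and the `compare_runs` function are hypothetical.

```python
def compare_runs(prev: dict[str, float], curr: dict[str, float]):
    """Return (improvements, regressions), each ranked by relative magnitude.

    Assumes lower is better for every metric.
    """
    improvements, regressions = [], []
    for name in sorted(prev.keys() & curr.keys()):
        if prev[name] == 0:
            continue  # relative change is undefined for a zero baseline
        change = (curr[name] - prev[name]) / prev[name]
        if change < 0:
            improvements.append((name, change))
        elif change > 0:
            regressions.append((name, change))
    improvements.sort(key=lambda item: abs(item[1]), reverse=True)
    regressions.sort(key=lambda item: abs(item[1]), reverse=True)
    return improvements, regressions


previous = {"duration_s": 120, "total_tokens": 50_000, "tool_failures": 2}
current = {"duration_s": 90, "total_tokens": 60_000, "tool_failures": 1}

improved, regressed = compare_runs(previous, current)
print("improved:", [name for name, _ in improved])
print("regressed:", [name for name, _ in regressed])
```

Using relative rather than absolute change keeps metrics with very different scales (seconds versus tokens) comparable in a single ranking.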

Recovery

| Scenario | What happens |
| --- | --- |
| Browser disconnected | Events are backfilled when you reopen the run |
| Platform restarted | Orphaned runs are detected and marked as errored |
| Run stuck | A background reaper catches runs in "running" state for more than 30 minutes and marks them as failed |
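The stuck-run rule amounts to a periodic scan over run records. The sketch below shows one way such a check could look, using the 30-minute threshold from the table; the record shape (`id`, `status`, `running_since`) and the `reap_stuck_runs` function are assumptions for illustration only.

```python
from datetime import datetime, timedelta, timezone

STUCK_AFTER = timedelta(minutes=30)  # threshold from the table above


def reap_stuck_runs(runs: list[dict], now: datetime) -> list[str]:
    """Mark runs that have been "running" for too long as failed.

    Mutates the matching records and returns their ids.
    """
    reaped = []
    for run in runs:
        if run["status"] == "running" and now - run["running_since"] > STUCK_AFTER:
            run["status"] = "failed"
            reaped.append(run["id"])
    return reaped


now = datetime(2025, 1, 1, 12, 0, tzinfo=timezone.utc)
runs = [
    {"id": "run-1", "status": "running", "running_since": now - timedelta(minutes=45)},
    {"id": "run-2", "status": "running", "running_since": now - timedelta(minutes=5)},
    {"id": "run-3", "status": "completed", "running_since": now - timedelta(hours=2)},
]
reaped = reap_stuck_runs(runs, now)
print(reaped)
```

Passing `now` in as a parameter, rather than reading the clock inside the function, keeps the check deterministic and easy to test.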