Project Insights

The Projects Overview page is a dashboard that surfaces operational health across all of your projects. Rather than requiring you to open each project individually, it aggregates activity, cost, and performance data into a single view with at-a-glance indicators.

Projects Overview Page

Navigate to Projects from the main sidebar. The page displays all projects as cards in a responsive grid, sorted with pinned projects first, then alphabetically by name.
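The sort order described above can be sketched as a two-part sort key. This is a minimal illustration, not the platform's implementation: the `Project` class, its field names, and the case-insensitive name comparison are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Project:
    # Hypothetical shape; the real project record has more fields.
    name: str
    pinned: bool = False


def sort_projects(projects):
    # Tuples compare element by element: `not pinned` is False for pinned
    # projects, and False sorts before True, so pinned projects come first.
    # Within each group, names sort alphabetically (case-insensitive here,
    # which is an assumption).
    return sorted(projects, key=lambda p: (not p.pinned, p.name.casefold()))


projects = [
    Project("zeta"),
    Project("Alpha"),
    Project("mango", pinned=True),
    Project("beta", pinned=True),
]
ordered = [p.name for p in sort_projects(projects)]
print(ordered)  # ['beta', 'mango', 'Alpha', 'zeta']
```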

If any workflow runs occurred in the past seven days, a weekly summary strip appears above the grid with aggregated platform-wide statistics.

Weekly Summary Strip

The summary strip consolidates the past seven days of activity across all projects into a single line.

| Metric | What it shows |
| --- | --- |
| Runs | Total number of workflow runs across all projects, and how many projects had activity |
| Success rate | Percentage of runs that completed successfully (as opposed to erroring or being stopped) |
| Total cost | Aggregate LLM cost for the week in dollars (only shown when cost tracking is active) |
| Cost trend | Percentage change in cost compared to the previous seven-day period. Green indicates costs went down; red indicates costs went up |
| Insights | Number of performance insights generated across all projects |
| Regressions | Number of insights classified as regressions -- performance metrics that worsened compared to prior runs |

Pending runs do not count toward this: the strip is hidden while every run in the past seven days is still pending, and appears once at least one run has left the pending state.
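As a rough sketch of how the cost trend figure could be derived, the percentage change compares this week's cost with the previous seven-day period. The function names, the `None` result when the previous period has no cost, and the "neutral" color for a zero change are assumptions for illustration; the page does not say how the platform handles those cases.

```python
from typing import Optional


def cost_trend_pct(current_cost: float, previous_cost: float) -> Optional[float]:
    # Percentage change versus the previous seven-day period.
    # With no prior cost the change is undefined, so return None (assumption).
    if previous_cost == 0:
        return None
    return (current_cost - previous_cost) / previous_cost * 100.0


def trend_color(pct: Optional[float]) -> str:
    # Green when cost went down, red when it went up.
    # "neutral" for no change or an undefined trend is an assumption.
    if pct is None or pct == 0:
        return "neutral"
    return "green" if pct < 0 else "red"


pct = cost_trend_pct(current_cost=9.0, previous_cost=12.0)
print(pct, trend_color(pct))  # -25.0 green
```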

Project Cards

Each project is represented by a card that shows the project name, description, and several layers of information.

Activity Accent

The left border of each card is color-coded to indicate the project's current state:

| Color | Meaning |
| --- | --- |
| Yellow | A workflow is currently running |
| Green | A workflow completed successfully in the last 24 hours |
| Red | A workflow errored in the last 24 hours |
| Gray | No recent activity (idle) |

The accent color reflects the most urgent state. If any workflow is actively running, the card shows yellow regardless of other recent outcomes.
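The precedence rule can be sketched as a short chain of checks. The function name, the `(status, updated_at)` tuple shape, and the status strings are hypothetical. The page only states that a running workflow takes priority; ranking a recent error above a recent success is an assumption made here to fit "most urgent state".

```python
from datetime import datetime, timedelta, timezone


def accent_color(runs, now):
    """Return the accent for a list of (status, updated_at) tuples."""
    # A running workflow wins regardless of other recent outcomes.
    if any(status == "running" for status, _ in runs):
        return "yellow"
    # Only outcomes from the last 24 hours count toward green or red.
    cutoff = now - timedelta(hours=24)
    recent = {status for status, updated_at in runs if updated_at >= cutoff}
    if "error" in recent:
        return "red"  # assumption: an error outranks a success
    if "completed" in recent:
        return "green"
    return "gray"


now = datetime(2024, 1, 8, 12, 0, tzinfo=timezone.utc)
runs = [
    ("completed", now - timedelta(hours=2)),
    ("error", now - timedelta(hours=30)),  # outside the 24-hour window
]
print(accent_color(runs, now))  # green
```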

Pinning

Click the star icon in the top-right corner of a card to pin or unpin a project. Pinned projects sort to the top of the grid, making it easy to keep your most important projects visible.

Version Badge

If any workflow in the project has versioned configuration snapshots, a version badge (e.g., "v3") appears next to the pin icon. This shows the latest workflow configuration version number across the project, giving you a quick sense of how actively the project's workflows are being iterated on.

Workflow Shortcuts

Below the project description, the card shows up to three recent workflow runs as clickable links. Each shortcut displays:

  • A status badge (the first letter of the status: C for completed, R for running, E for error, S for stopped)
  • The workflow name
  • Time since last update (e.g., "3 hours ago")

Clicking a shortcut navigates directly to that workflow's detail page.

Activity Stats (7-Day)

A compact stats line shows the project's activity over the past seven days:

| Stat | Description |
| --- | --- |
| Runs | Number of workflow runs in the past 7 days |
| Success rate | Percentage that completed without error |
| Cost | Total LLM cost for the period in dollars |
| Cost trend | Percentage change versus the prior 7-day period. Green means cost decreased; red means cost increased |

These stats mirror the platform-wide summary strip but are scoped to a single project.

Insight Headlines

When the platform detects a notable change in a project's workflow performance, it generates a ReflectionInsight -- a short headline describing what changed and by how much.

Each project card can display one insight headline: the highest-severity, most recent insight from the past seven days. The headline includes:

  • A severity badge indicating the magnitude of the change:
    • I (info) -- change under 10%
    • W (warning) -- change between 10% and 25%
    • C (critical) -- change of 25% or more
  • The headline text -- a human-readable summary of the metric change (e.g., "Duration dropped 40%")
  • Color coding -- green text for improvements, red for regressions
  • Time since detection
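The severity thresholds and the headline selection can be sketched as follows. The dictionary keys, function names, and sample insights are hypothetical. Reading "highest-severity, most recent" as "rank by severity first, then break ties by recency" is an interpretation, not something the page confirms.

```python
from datetime import datetime, timedelta, timezone

SEVERITY_RANK = {"I": 0, "W": 1, "C": 2}


def severity(change_pct):
    # Thresholds from the list above, applied to the size of the change.
    magnitude = abs(change_pct)
    if magnitude >= 25:
        return "C"  # critical: 25% or more
    if magnitude >= 10:
        return "W"  # warning: 10% up to 25%
    return "I"      # info: under 10%


def pick_headline(insights, now):
    # Keep only insights from the past seven days.
    cutoff = now - timedelta(days=7)
    recent = [i for i in insights if i["detected_at"] >= cutoff]
    if not recent:
        return None
    # Highest severity first; most recent detection breaks ties (assumption).
    return max(
        recent,
        key=lambda i: (SEVERITY_RANK[severity(i["change_pct"])], i["detected_at"]),
    )


now = datetime(2024, 1, 8, tzinfo=timezone.utc)
insights = [
    {"headline": "Duration dropped 40%", "change_pct": -40.0,
     "detected_at": now - timedelta(days=3)},
    {"headline": "Tool calls rose 12%", "change_pct": 12.0,
     "detected_at": now - timedelta(days=1)},
    {"headline": "Output tokens rose 60%", "change_pct": 60.0,
     "detected_at": now - timedelta(days=9)},  # too old to be shown
]
best = pick_headline(insights, now)
print(severity(best["change_pct"]), best["headline"])  # C Duration dropped 40%
```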

Health Indicators

At the bottom of each card, alongside the counts of teams, workflows, agents, and documents, two health indicators may appear:

  • Eff (Efficiency) -- Reflects trends in resource consumption metrics: duration, LLM calls, token usage, iterations, tool calls, and compaction tokens
  • Qual (Quality) -- Reflects trends in output quality metrics, primarily tool failure rates

Each indicator shows one of three states:

| State | Meaning | Color |
| --- | --- | --- |
| improving | More insights in this category were improvements than regressions over the past 7 days | Green |
| regressing | More insights in this category were regressions than improvements | Red |
| stable | Equal numbers of improvements and regressions, or no insights in this category | Gray |

These two axes give you an immediate read on whether your workflows are trending in the right direction without needing to inspect individual runs.
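The three states reduce to a comparison of two counts. A minimal sketch, assuming each insight carries a category and an improvement flag; the field names and category labels are hypothetical, and the insights passed in are taken to be already limited to the past 7 days.

```python
STATE_COLOR = {"improving": "green", "regressing": "red", "stable": "gray"}


def health_state(insights, category):
    # Count improvements and regressions within one category.
    in_category = [i for i in insights if i["category"] == category]
    improvements = sum(1 for i in in_category if i["is_improvement"])
    regressions = len(in_category) - improvements
    if improvements > regressions:
        return "improving"
    if regressions > improvements:
        return "regressing"
    # Equal counts, including the case of no insights at all.
    return "stable"


week_insights = [
    {"category": "efficiency", "is_improvement": True},
    {"category": "efficiency", "is_improvement": True},
    {"category": "efficiency", "is_improvement": False},
    {"category": "quality", "is_improvement": False},
]
eff = health_state(week_insights, "efficiency")
qual = health_state(week_insights, "quality")
print(eff, STATE_COLOR[eff])    # improving green
print(qual, STATE_COLOR[qual])  # regressing red
```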

Knowledge Graph Badge

If the project has Long-Term Memory enabled, a KG badge appears in the stats row, indicating that the Knowledge Curator is active for this project.

Tracked Metrics

ReflectionInsights and the progression system track specific operational metrics. Each metric falls into one of two categories.

Efficiency Metrics

Metrics where lower is better (reduced resource usage):

| Metric | What it measures |
| --- | --- |
| Duration | Total wall-clock time for the run |
| LLM Calls | Number of requests made to LLM providers |
| Input Tokens | Tokens sent to the model |
| Output Tokens | Tokens generated by the model |
| Tool Calls | Number of tool executions |
| Iterations | Number of agent thinking loops |
| Cycles | Number of agent turn cycles per stage |
| Compaction Tokens | Tokens consumed by context compaction |

One efficiency metric is the exception, where higher is better:

| Metric | What it measures |
| --- | --- |
| Cached Tokens | Tokens served from cache rather than recomputed; more is better |

Quality Metrics

| Metric | What it measures |
| --- | --- |
| Tool Failures | Number of tool executions that returned errors |

Run Progression View

For a detailed view of how a specific workflow's performance is evolving, open a workflow and navigate to the Progression tab. This view compares up to ten consecutive completed runs in chronological order.

How It Works

The system fetches execution metrics for each completed run, then compares consecutive pairs. For each pair of runs, it computes the delta for every metric at two levels:

  • Overall -- Run-level totals (total duration, total tokens, total tool calls, etc.)
  • Per-stage -- Stage-level metrics for stages that appear in both runs

Each delta is classified as an improvement or a regression based on the metric type (lower-is-better or higher-is-better), then ranked by magnitude. The top five improvements and top five regressions are displayed for each run pair.
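The pairwise comparison can be sketched like this. It is an illustration under stated assumptions, not the platform's code: the metric keys, the tuple layout, and the choice to skip unchanged metrics or metrics whose old value is zero are all hypothetical. The same function could be applied to per-stage metric dictionaries for stages present in both runs.

```python
# Assumption: cached tokens is the only higher-is-better metric, per the
# Tracked Metrics section; every other metric is lower-is-better.
HIGHER_IS_BETTER = {"cached_tokens"}


def compare_runs(old, new, top_n=5):
    """Return (improvements, regressions), each ranked by magnitude."""
    improvements, regressions = [], []
    for metric, before in old.items():
        if metric not in new:
            continue
        after = new[metric]
        if before == 0 or after == before:
            continue  # assumption: skip undefined or zero-size changes
        pct = (after - before) / before * 100.0
        improved = pct > 0 if metric in HIGHER_IS_BETTER else pct < 0
        bucket = improvements if improved else regressions
        bucket.append((metric, before, after, pct))

    def rank(items):
        # Largest absolute percentage change first, capped at top_n.
        return sorted(items, key=lambda d: abs(d[3]), reverse=True)[:top_n]

    return rank(improvements), rank(regressions)


def compare_consecutive(runs):
    # Runs are assumed to be in chronological order; compare each with the next.
    return [compare_runs(a, b) for a, b in zip(runs, runs[1:])]


run_a = {"duration": 100.0, "tool_calls": 10, "cached_tokens": 200}
run_b = {"duration": 60.0, "tool_calls": 12, "cached_tokens": 300}
better, worse = compare_runs(run_a, run_b)
print([m for m, *_ in better])  # ['cached_tokens', 'duration']
print([m for m, *_ in worse])   # ['tool_calls']
```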

Insight Summary

Above the run-pair comparisons, an insight summary shows the total count of improvements and regressions across all pairs. Each insight is displayed as a pill with its severity badge and headline text, color-coded green for improvements and red for regressions.

Run Pair Comparisons

Each comparison card shows:

  • Run identifiers -- The two run IDs and their timestamps, connected by an arrow
  • Improvement/regression counts -- A summary of how many metrics improved vs. regressed
  • Config diff (when applicable) -- If the workflow configuration was modified between the two runs (different version), the specific changes are displayed. This lets you correlate a prompt edit, agent swap, or stage reorder with the resulting metric shifts
  • Ranked insights -- Two columns listing the top improvements (green, left) and top regressions (red, right). Each insight shows:
    • The metric name and scope (overall or specific stage)
    • The old and new values
    • The percentage change, with visual weight scaled by rank -- the largest changes are the most prominent

Interpreting the Data

The progression view answers a specific question: "Is this workflow getting better or worse over time, and why?"

Common patterns:

| Pattern | What it suggests |
| --- | --- |
| Consistent improvements after a config change | The edit was effective -- the new prompt, agent, or stage order is performing better |
| Regressions following a config change | The change had unintended consequences -- consider reverting or adjusting |
| Gradual regression with no config changes | External factors may be at play -- changed input data, provider model updates, or accumulated context issues |
| Mixed improvements and regressions | A tradeoff was made -- for example, adding a review stage may increase duration but reduce tool failures |

Related

  • Observability -- Detailed run metrics, event tracing, and context window analysis
  • Running Workflows -- Manual execution, scheduling, and re-running from a stage
  • Projects -- Project setup, documents, and Long-Term Memory configuration