Dashboard widgets are visualization components that display evaluation data in different formats. This guide provides an overview of all available widget types.

Available Widget Types

Data-Driven Widgets

These widgets require queries and compute results from evaluation data:

Metric Widget

Display a single computed value like averages, counts, or percentiles

Table Widget

Show multi-dimensional data in rows and columns with grouping, sorting, and conditional formatting

Bar Chart

Visualize comparisons across categories with bar charts

Histogram

Display data distributions across numerical ranges

Donut Chart

Show proportional distribution of categories as a donut/pie chart

Scatter Plot

Explore correlations between two numerical dimensions

Timeseries

Track metrics over time with line charts

Static Content Widgets

These widgets display static content without queries:

Markdown Widget

Add rich text documentation, instructions, and notes

Heading Widget

Organize dashboards with section headers

Quick Reference

| Widget Type | Purpose | Requires Query | Best For |
| --- | --- | --- | --- |
| Metric | Single aggregation value | Yes | Top-level metrics, summary stats |
| Table | Tabular data display | Yes | Detailed breakdowns, grouped data |
| Bar | Bar chart visualization | Yes | Categorical comparisons |
| Histogram | Distribution visualization | Yes | Understanding distribution of data |
| Donut | Proportional distribution | Yes | Category breakdowns, percentage views |
| Scatter | X-Y relationship visualization | Yes | Correlations, outlier detection |
| Timeseries | Trends over time | Yes | Tracking metrics across evaluation runs |
| Markdown | Rich text content | No | Documentation, instructions, notes |
| Heading | Section headers | No | Dashboard organization |

Choosing the Right Widget Type

Display a Single Number

Use a Metric widget when you need to show one computed value prominently:
  • Overall average score
  • Total item count
  • Pass rate percentage
  • 95th percentile latency
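
To make these four values concrete, here is a minimal Python sketch of the aggregations themselves, computed over a handful of made-up evaluation items. The field names (`score`, `passed`, `latency_ms`) are hypothetical, and the platform may compute percentiles with a different interpolation rule; this only illustrates what "one computed value" means in each case.

```python
from statistics import mean, quantiles

# Hypothetical evaluation items; the field names are illustrative only.
items = [
    {"score": 0.9, "passed": True, "latency_ms": 120},
    {"score": 0.4, "passed": False, "latency_ms": 340},
    {"score": 0.7, "passed": True, "latency_ms": 210},
    {"score": 0.8, "passed": True, "latency_ms": 180},
]

# Overall average score
average_score = mean(item["score"] for item in items)

# Total item count
total_items = len(items)

# Pass rate percentage (True counts as 1 when summed)
pass_rate = 100 * sum(item["passed"] for item in items) / total_items

# 95th percentile latency: quantiles(n=100) returns 99 cut points,
# so index 94 is the 95th percentile. "inclusive" keeps the result
# within the observed minimum and maximum.
latencies = [item["latency_ms"] for item in items]
p95_latency = quantiles(latencies, n=100, method="inclusive")[94]

print(average_score, total_items, pass_rate, p95_latency)
```

Each of these is a single number, which is the kind of result a Metric widget displays.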

Show Detailed Breakdowns

Use a Table widget when you need to display multiple dimensions and metrics:
  • Performance by model and category
  • Top 10 highest/lowest scoring items
  • Multi-column statistical summaries

Compare Categories

Use a Bar Chart widget to visualize how values differ across groups:
  • Score by category
  • Model performance comparison
  • Items per status

Visualize Distributions

Use a Histogram widget to understand how values are distributed:
  • Score distribution
  • Latency distribution
  • Confidence level spread

Show Proportional Breakdowns

Use a Donut Chart widget to visualize how categories are distributed:
  • Task type distribution
  • Agent workload share
  • Prompt category breakdown

Explore Correlations

Use a Scatter Plot widget to examine relationships between two metrics:
  • Accuracy vs. relevance
  • Score vs. response length
  • Latency vs. quality

Track Metrics Over Time

Use a Timeseries widget to monitor metrics across evaluation runs:
  • Score progression across evaluations
  • Multi-metric trends in evaluation groups
  • Performance tracking over time

Add Documentation

Use Markdown widgets for context and instructions:
  • Dashboard methodology
  • Alert thresholds
  • Team contact information
  • Analysis guidelines

Organize Sections

Use Heading widgets to structure your dashboard:
  • “Overview Metrics”
  • “Detailed Analysis”
  • “Performance by Category”

Widget Configuration Overview

All widgets share a common set of fields.

Required Fields:
  • title: Display name for the widget
  • type: One of: metric, table, bar, histogram, donut, scatter, timeseries, markdown, heading

Query Field (for data-driven widgets):
  • query: QueryAST object defining the computation

Optional Fields:
  • config: Widget-specific display configuration
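
The field rules above can be sketched as a small Python check. This is an illustration under assumptions, not the platform's own validation: `validate_widget` is a hypothetical helper, the widget is modeled as a plain dict, and the `query` value is an opaque placeholder because the QueryAST structure is not described on this page.

```python
# Widget types taken from the list above, split by whether they need a query.
DATA_DRIVEN_TYPES = {
    "metric", "table", "bar", "histogram", "donut", "scatter", "timeseries",
}
STATIC_TYPES = {"markdown", "heading"}


def validate_widget(widget: dict) -> list[str]:
    """Return a list of problems; an empty list means these checks passed."""
    problems = []
    if not widget.get("title"):
        problems.append("missing required field: title")
    widget_type = widget.get("type")
    if widget_type not in DATA_DRIVEN_TYPES | STATIC_TYPES:
        problems.append(f"unknown type: {widget_type!r}")
    elif widget_type in DATA_DRIVEN_TYPES and "query" not in widget:
        problems.append(f"{widget_type} widgets need a query")
    return problems


# Hypothetical widget definitions for illustration.
metric_widget = {
    "title": "Average score",
    "type": "metric",
    "query": {"placeholder": "QueryAST object goes here"},  # shape not shown
    "config": {},  # optional, widget-specific display settings
}
heading_widget = {"title": "Overview Metrics", "type": "heading"}

print(validate_widget(metric_widget))
print(validate_widget(heading_widget))
print(validate_widget({"title": "Broken", "type": "bar"}))
```

The static types pass without a `query`, while a data-driven type without one is reported.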

Data Sources

Data-driven widgets can query two sources:
  • data.*: User-defined evaluation data (your custom fields)
  • task_result_cache.*: Task execution results
Both support nested fields using dot notation (e.g., data.metadata.model_version).
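
As a rough illustration of how a dot-notation path maps onto nested data, the following Python sketch walks a path one key at a time. The `resolve_field` helper and the sample record are hypothetical; the platform's actual lookup behavior (for example, how missing fields are handled) may differ.

```python
from typing import Any


def resolve_field(record: dict, path: str) -> Any:
    """Follow a dot-separated path through nested dicts; None if any key is missing."""
    current: Any = record
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]
    return current


# Hypothetical record with both sources as top-level keys.
record = {
    "data": {"metadata": {"model_version": "v2"}, "score": 0.8},
    "task_result_cache": {"status": "completed"},
}

print(resolve_field(record, "data.metadata.model_version"))
print(resolve_field(record, "task_result_cache.status"))
print(resolve_field(record, "data.metadata.missing_field"))
```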

Reusing Widgets

Widgets can be shared across multiple dashboards. When you include the same widget ID in different dashboards’ widget_order, they all use the same widget definition but produce different results based on their evaluation context.
When reusing widgets, make sure each evaluation's data has the same shape, in particular the fields the widget's query references. Otherwise, the widget cannot compute a result.
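
The sharing model can be pictured with a short Python sketch. Everything here is hypothetical (the widget and dashboard IDs, the dict layout, and both helper functions); only the `widget_order` field name comes from the text above. The shape check also assumes that "same shape" means the same top-level field names on every item, which is a simplification.

```python
# One widget definition, referenced by ID from two dashboards.
widgets = {
    "w_avg_score": {"title": "Average score", "type": "metric"},
}
dashboards = {
    "dash_a": {"widget_order": ["w_avg_score"]},
    "dash_b": {"widget_order": ["w_intro", "w_avg_score"]},
}


def dashboards_using(widget_id: str, dashboards: dict) -> list[str]:
    """Names of the dashboards whose widget_order includes the widget ID."""
    return sorted(
        name for name, dash in dashboards.items()
        if widget_id in dash["widget_order"]
    )


def same_shape(evaluations: dict[str, list[dict]]) -> bool:
    """True when every item in every evaluation has the same top-level keys."""
    shapes = {
        frozenset(item) for items in evaluations.values() for item in items
    }
    return len(shapes) <= 1


# Hypothetical evaluation data: eval_2 lacks the "category" field.
evaluations = {
    "eval_1": [{"score": 0.9, "category": "math"}],
    "eval_2": [{"score": 0.6}],
}

print(dashboards_using("w_avg_score", dashboards))
print(same_shape(evaluations))
```

A check like this, run before attaching a shared widget to a new dashboard, can catch evaluations whose data lacks a field the widget depends on.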
