Getting Started with Dashboards

This guide walks you through creating your first evaluation dashboard from scratch, adding widgets, and organizing your layout.

Using Sample Data

To follow along with this tutorial, you can use our sample evaluation dataset containing 40 agent evaluation items with realistic scores and metadata. Upload the sample data to your account and create an evaluation with it. Download Sample Data:

Download sample-evaluation.csv

Sample dataset with 40 evaluation items across 4 agents (GPT-4, Claude-3, Gemini-Pro, Llama-3)

Data Structure: The sample data contains evaluation items with this structure:

{
  "id": "eval_001",
  "agent_name": "GPT-4-Turbo-Agent",
  "agent_version": "1.0",
  "judged_evaluation": {
    "overall_score": 87,
    "accuracy_score": 92,
    "relevance_score": 85,
    "coherence_score": 89,
    "helpfulness_score": 84,
    "fluency_score": 91
  },
  "timestamp": "2026-01-15T10:30:00Z",
  "task_type": "question_answering",
  "prompt_category": "technical",
  "response_length": 256,
  "model_temperature": 0.7
}

Key Fields:

agent_name: Model being evaluated (GPT-4-Turbo-Agent, Claude-3-Sonnet-Agent, etc.)
*: Nested scores (overall_score, accuracy_score, relevance_score, coherence_score, helpfulness_score, fluency_score)
task_type: Type of task (question_answering, summarization, code_generation, analysis, translation, creative_writing)
prompt_category: Category (technical, general, business, language, creative)

How to Use:

Download the JSON file
Create a new evaluation via the API or SDK:

from scale_gp_beta import SGPClient
import json

# Using api.dev-gp.scale.com
client = SGPClient(
    api_key="your-api-key",
    account_id="your-account-id",
    environment="development"
)

# Load sample data
with open('sample-evaluation.csv', 'r') as f:
    sample_items = json.load(f)

# Create evaluation with sample data
evaluation = client.evaluations.create(
    name="Agent Performance Comparison",
    data=sample_items
)

print(f"Created evaluation: {evaluation.id}")

The client uses the environment parameter to connect to different Scale GP deployments. Available options: "production", "production-multitenant", "development", "staging", "local". For custom endpoints, use base_url instead.

Follow the rest of this guide to create dashboards and widgets using this evaluation

The examples throughout this guide reference fields from this sample dataset. If using your own data, adjust the column names accordingly.

Prerequisites

Before creating a dashboard, you need either:

An existing evaluation with completed results, OR
An evaluation group containing multiple evaluations (Coming soon)

If you don’t have an evaluation yet, see Next Gen Evaluation Getting Started to create one.

Step 1: Create a New Dashboard

Via the UI

Navigate your version of SGP (Dev SGP)
Make sure the evaluation-dashboards-enabled feature flag is enabled for your account
1. (Instructions to enable the feature flag)
Click the “Dashboards” tab
Click the “New Dashboard” button

Fill in the dashboard details:
- Name: Give your dashboard a descriptive name (e.g., “Model Performance Overview”)
- Description: Optional description explaining the dashboard’s purpose
- Tags: Optional tags for organization and filtering
- Evaluation: Select the evaluation you want to create a dashboard for
Click “Create” to save your dashboard

Via the SDK

from scale_gp_beta import SGPClient

client = SGPClient(
    api_key="your-api-key",
    account_id="your-account-id",
    environment="development"
)

# Create dashboard for a single evaluation
dashboard = client.evaluation_dashboards.create(
    name="Demo Dashboard",
    description="A demo dashboard for the demo evaluation",
    evaluation_id="eval-123",
    tags=["demo", "documentation"]
)

Let’s add a metric widget to display the average score across all evaluation items.

Via the UI

From your dashboard page, click “Add Widget”
Select “Query Value” as the widget type
Configure the widget:
- Title: “Average Score”
- Query: Select the average of the “score” column
Click “Add”

Widget results are automatically computed when you create or update a widget. The response includes both the widget configuration and the computed result.

Via the API

# Add a metric widget showing average score
widget = client.evaluation_dashboards.widgets.create(
    dashboard_id=dashboard.id,
    title="Average Score",
    type="metric",
    query={
        "select": [
            {
                "expression": {
                    "type": "AGGREGATION",
                    "function": "AVG",
                    "column": "overall_score",
                    "source": "data"
                }
            }
        ]
    }
)

print(f"Computed result: {widget.result.computed_result}")
# Output: {'type': 'metric', 'data': 0.873}

Now let’s add a bar chart to show score distribution across different models.

Via the UI

Click “Add Widget” again
Select “Bar Chart” as the widget type
Configure the widget:
- Title: “Score by Agent”
- Group By: Select “agent_name”
- Under the Advanced Options
  - Add an aggregation, select “Average” on “overall_score”
Click “Add”

Via the SDK

# Add a bar chart widget showing average score by category
widget = client.evaluation_dashboards.widgets.create(
    dashboard_id=dashboard.id,
    title="Score by Agent",
    type="bar",
    query={
        "select": [
            {
                "expression": {
                    "type": "COLUMN",
                    "column": "agent_name",
                    "source": "data"
                }
            },
            {
                "expression": {
                    "type": "AGGREGATION",
                    "function": "AVG",
                    "column": "overall_score",
                    "source": "data"
                }
            }
        ],
        "groupBy": ["agent_name"]
    },
    config={
        "x_column": "agent_name"
    }
)

Step 4: Add Section Headers

Use heading widgets to organize your dashboard into logical sections.

Via the UI

Click “Add Widget”
Select “Heading” as the widget type
Configure the widget:
- Title: “Graphs”
Click “Add”

Via the SDK

# Add a heading widget
heading = client.evaluation_dashboards.widgets.create(
    dashboard_id=dashboard.id,
    title="Graphs",
    type="heading"
)

Step 5: Organize and Configure Layout

Reorder Widgets

Arrange widgets in your preferred order by dragging and dropping in the UI, or update the widget order via the API:

# Reorder widgets - first widget appears at top
client.evaluation_dashboards.update(
    dashboard_id=dashboard.id,
    widget_order=[heading.id, widget1.id, widget2.id]
)

Next Steps

Learn about all available Widget Types
Master the Query Language for advanced filtering and aggregations
Explore API Reference for programmatic query creation

Overview

Getting Started

Evaluations

Evaluation Dashboards

Tracing

Agents

Getting Started with Dashboards

Using Sample Data

Download sample-evaluation.csv

Prerequisites

Step 1: Create a New Dashboard

Via the UI

Via the SDK

Step 2: Add Your First Widget (Metric)

Via the UI

Via the API

Step 3: Add a Chart Widget (Bar Chart)

Via the UI

Via the SDK

Step 4: Add Section Headers

Via the UI

Via the SDK

Step 5: Organize and Configure Layout

Reorder Widgets

Next Steps

Overview

Getting Started

Evaluations

Evaluation Dashboards

Tracing

Agents

​Using Sample Data

Download sample-evaluation.csv

​Prerequisites

​Step 1: Create a New Dashboard

​Via the UI

​Via the SDK

​Step 2: Add Your First Widget (Metric)

​Via the UI

​Via the API

​Step 3: Add a Chart Widget (Bar Chart)

​Via the UI

​Via the SDK

​Step 4: Add Section Headers

​Via the UI

​Via the SDK

​Step 5: Organize and Configure Layout

​Reorder Widgets

​Next Steps

Using Sample Data

Prerequisites

Step 1: Create a New Dashboard

Via the UI

Via the SDK

Step 2: Add Your First Widget (Metric)

Via the UI

Via the API

Step 3: Add a Chart Widget (Bar Chart)

Via the UI

Via the SDK

Step 4: Add Section Headers

Via the UI

Via the SDK

Step 5: Organize and Configure Layout

Reorder Widgets

Next Steps