
Prompts & Scores

Prompts and scores are the building blocks for testing and measuring AI quality. Prompts are versioned templates for model calls used in experiments. Scores are reusable metrics used in evaluations to measure quality.


What it is

Prompts let you:

  • Create versioned templates for model calls (system/user messages, parameters).
  • Track changes over time with versioning and diffs.
  • Use prompts in experiments to test different versions against datasets.
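The bullets above describe a versioned template with system/user messages and parameters. As a minimal sketch only (the `PromptVersion` class and `render` method are hypothetical illustrations, not this product's actual API), a prompt version might look like:

```python
from dataclasses import dataclass, field

# Hypothetical sketch of a versioned prompt template.
# PromptVersion and render() are illustrative names, not the product's API.
@dataclass
class PromptVersion:
    version: int
    system: str
    user: str                                   # may contain {placeholders}
    params: dict = field(default_factory=dict)  # e.g. temperature, max_tokens

    def render(self, **variables) -> list[dict]:
        """Fill placeholders and return chat-style messages for a model call."""
        return [
            {"role": "system", "content": self.system},
            {"role": "user", "content": self.user.format(**variables)},
        ]

v2 = PromptVersion(
    version=2,
    system="You are a concise support assistant.",
    user="Summarise this ticket: {ticket}",
    params={"temperature": 0.2},
)
messages = v2.render(ticket="App crashes on login.")
```

Keeping each edit as a new immutable version is what makes diffs and promotion between versions straightforward.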

Scores let you:

  • Define reusable metrics with a scoring type (Numeric, Ordinal, Nominal, or RAGAS, a scoring framework); optionally attach an evaluator prompt for LLM-as-judge scoring.
  • Use scores in evaluations to measure the quality of datasets or experiment results.
  • Track quality consistently across different runs.

Together, prompts generate outputs (via experiments) and scores measure those outputs (via evaluations). See Prompts, Models & Scores for the underlying concepts.


What you can do

  • Prompts — Create prompts with versions, edit to create new versions, compare versions, and promote versions. Use prompts in experiments.
  • Scores — Create scores with a name, scoring type (Numeric, Ordinal, Nominal, or RAGAS, a scoring framework), optional scale options, and an optional evaluator prompt for LLM-as-judge. Use scores in evaluations.

Getting started

  1. Register model configurations (Organisation Configuration → AI Models) so prompts can run.
  2. Create a prompt — Build a versioned template with system/user messages and parameters.
  3. Create scores — Define metrics you want to measure (e.g. correctness, relevance, safety).
  4. Use prompts in Experiments and scores in Evaluations.
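The four steps above can be sketched end to end. Everything here is a hypothetical stand-in (the dictionaries and the `run_experiment` / `score_relevance` functions are placeholders, not the product's API): a model configuration lets the prompt run, an experiment produces an output, and a score measures it.

```python
# Step 1: a registered model configuration (illustrative shape only).
model = {"provider": "example", "model": "example-model", "temperature": 0.2}

# Step 2: a versioned prompt with system/user messages.
prompt = {
    "version": 1,
    "system": "Answer factually.",
    "user": "What is the capital of France?",
}

def run_experiment(model: dict, prompt: dict) -> str:
    # Stand-in for a real model call made by an experiment (step 4).
    return "Paris"

# Step 3: a simple numeric score applied during evaluation.
def score_relevance(output: str) -> float:
    return 1.0 if "Paris" in output else 0.0

output = run_experiment(model, prompt)
score = score_relevance(output)  # 1.0
```

In practice the model call and scoring happen inside Experiments and Evaluations respectively; the sketch only shows how the pieces connect.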

Pages in this section

  • Prompts — Create, version, and manage prompts for experiments.
  • Scores — Define metrics for evaluations.