Datasets

Datasets are first-class collections of items that you curate for evaluations and experiments. Items can come from traces or from any other source you import; a dataset does not have to be tied to tracing.

What they are

Saved sets of items with a table-like structure: a header (column names) and rows (values).
Managed in the Datasets page (create, import, edit, delete).
Used directly by Evaluations and Experiments; no extra wiring needed.
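The header-and-rows shape described above can be modeled in a few lines. The sketch below is illustrative only: the `Dataset` class, its field names, and the sample column names are assumptions made for this example, not Arcane's actual schema or API.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical in-memory model of a dataset: a header plus aligned rows."""

    name: str
    header: list[str]
    rows: list[list[str]] = field(default_factory=list)
    description: str = ""

    def add_row(self, values: list[str]) -> None:
        # Each row must supply exactly one value per header column.
        if len(values) != len(self.header):
            raise ValueError(
                f"expected {len(self.header)} values, got {len(values)}"
            )
        self.rows.append(list(values))

    def records(self) -> list[dict[str, str]]:
        # View each row as a column-name -> value mapping.
        return [dict(zip(self.header, row)) for row in self.rows]


# Assumed example columns; a real dataset can use any header.
ds = Dataset(name="support-questions", header=["question", "expected_answer"])
ds.add_row(["How do I reset my password?", "Use the reset link on the login page."])
print(ds.records())
```

Keeping rows as plain lists aligned to the header mirrors the table-like structure; the `records()` view is a convenience for code that prefers to look values up by column name.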

Why it matters

Stable, repeatable inputs for scoring and A/B comparisons.
Lets you rerun the same set after prompt/model changes to spot regressions.
Can feed annotation queues when you need labeled ground truth on a known set.

How it works in Arcane

Create a dataset (name, description; items can be anything you import or select, including from traces).
Add items via import or the dataset builder (you can pull from trace searches, but it’s optional).
(Optional) Send items to an annotation queue for labeling.
Run Evaluations or Experiments against the dataset; reuse it for consistent baselines.
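To illustrate why reusing one dataset gives a consistent baseline, here is a minimal local sketch that scores two prompt variants against the same rows and lists the items where the second variant regressed. Everything in it is an assumption for illustration: `variant_a` and `variant_b` are stand-ins for model calls, `exact_match` is a deliberately simple metric, and none of these names come from Arcane.

```python
from statistics import mean

# A tiny dataset: header plus rows aligned to it (assumed column names).
header = ["question", "expected_answer"]
rows = [
    ["What is 2 + 2?", "4"],
    ["What is the capital of France?", "Paris"],
]


def variant_a(question: str) -> str:
    # Stand-in for a model call made with prompt version A.
    answers = {"What is 2 + 2?": "4", "What is the capital of France?": "Paris"}
    return answers.get(question, "")


def variant_b(question: str) -> str:
    # Stand-in for prompt version B, which misses one item.
    answers = {"What is 2 + 2?": "4"}
    return answers.get(question, "")


def exact_match(output: str, expected: str) -> float:
    # Simple metric: 1.0 when the output equals the expected answer.
    return 1.0 if output.strip() == expected.strip() else 0.0


def evaluate(generate, header: list[str], rows: list[list[str]]) -> list[float]:
    # Score every row with the same metric so variants are comparable.
    q = header.index("question")
    e = header.index("expected_answer")
    return [exact_match(generate(row[q]), row[e]) for row in rows]


scores_a = evaluate(variant_a, header, rows)
scores_b = evaluate(variant_b, header, rows)

# Items where variant B scored lower than variant A on identical input.
regressions = [
    rows[i][0] for i, (a, b) in enumerate(zip(scores_a, scores_b)) if b < a
]

print(f"A: {mean(scores_a):.2f}  B: {mean(scores_b):.2f}")
print("Regressions:", regressions)
```

Because both variants see exactly the same rows, any difference in score can be attributed to the prompt or model change rather than to a shift in inputs.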

Fields you’ll see

Name, description.
Header: column names.
Rows: values aligned to the header.
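As a rough picture of how these fields fit together, the sketch below reads CSV text into a name, description, header, and rows, and rejects rows that do not line up with the header. The `load_dataset` function and the dictionary layout are assumptions for illustration; the import formats and validation rules Arcane actually applies are not specified here.

```python
import csv
import io


def load_dataset(name: str, description: str, csv_text: str) -> dict:
    """Parse CSV text into the four fields: name, description, header, rows."""
    reader = csv.reader(io.StringIO(csv_text))
    try:
        header = next(reader)  # The first line supplies the column names.
    except StopIteration:
        raise ValueError("CSV is empty; a header row is required") from None

    rows = []
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # Skip blank lines.
        if len(row) != len(header):
            raise ValueError(
                f"line {line_no}: expected {len(header)} values, got {len(row)}"
            )
        rows.append(row)

    return {
        "name": name,
        "description": description,
        "header": header,
        "rows": rows,
    }


# Assumed sample data for the example.
csv_text = "input,expected\nhello,greeting\nbye,farewell\n"
dataset = load_dataset("intents", "Small intent-labeling set", csv_text)
print(dataset["header"], len(dataset["rows"]))
```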

Related concepts

Annotations & Queues: label traces and conversations, and derive datasets to use as ground truth.
Scores & Evaluations: run metrics against a dataset to track quality.
Experiments: run prompt versions against a dataset and compare the variants on identical inputs. See Experiments.
