Annotations & Queues
Annotations turn traces or conversations into labeled data; queues organize that labeling work for reviewers.
What users create
- Annotation Queue: A workspace for reviewers. You pick whether its items are traces or conversations and define the questions reviewers must answer.
- Questions: The prompts reviewers see. Supported types: Freeform, Boolean, Multiple Choice, Single Choice, and Numeric. Each question can carry helper text, a placeholder, a required/optional flag, a default value, and, for Numeric questions, min/max bounds.
- Answers: Reviewer submissions tied to a queue item (trace or conversation), stored per question with the appropriate value type.
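One way to picture how these three objects relate is the minimal data-model sketch below. It is illustrative only: every class and field name here (`QuestionType`, `ItemType`, `Question`, `AnnotationQueue`, `Answer`, `question_index`, and so on) is an assumption made for the example, not Arcane's actual schema or API.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Optional, Union


class QuestionType(Enum):
    FREEFORM = "freeform"
    BOOLEAN = "boolean"
    MULTIPLE_CHOICE = "multiple_choice"
    SINGLE_CHOICE = "single_choice"
    NUMERIC = "numeric"


class ItemType(Enum):
    TRACES = "traces"
    CONVERSATIONS = "conversations"


# An answer is stored in the value type that matches its question.
AnswerValue = Union[str, bool, int, float, list[str]]


@dataclass
class Question:
    text: str
    type: QuestionType
    helper: str = ""
    placeholder: str = ""
    options: list[str] = field(default_factory=list)  # choice types only
    required: bool = True
    default: Optional[AnswerValue] = None
    min_value: Optional[float] = None  # numeric only
    max_value: Optional[float] = None  # numeric only


@dataclass
class AnnotationQueue:
    name: str
    item_type: ItemType
    description: str = ""
    questions: list[Question] = field(default_factory=list)


@dataclass
class Answer:
    item_id: str          # hypothetical ID of the trace or conversation
    question_index: int   # position of the question in the queue template
    value: AnswerValue


# Example: a traces queue with one Boolean and one Numeric question.
queue = AnnotationQueue(
    name="Factuality review",
    item_type=ItemType.TRACES,
    questions=[
        Question("Is the answer factually correct?", QuestionType.BOOLEAN),
        Question("Rate overall quality", QuestionType.NUMERIC,
                 min_value=1, max_value=5, required=False),
    ],
)
answer = Answer(item_id="trace-001", question_index=0, value=True)
```

The point of the sketch is the shape of the relationship: a queue owns a question template, and each answer points back to one item and one question.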
Why it matters
- Produces consistent human labels for evaluations and experiments.
- Keeps review load organized (who labels what, and what’s left).
- Enables regression checks after model/prompt changes using the same queues.
How it works in Arcane
- Create an Annotation Queue (Traces or Conversations).
- Add Questions to the queue’s template (choose type, options, requirements).
- Enqueue items (traces or conversations) into the queue.
- Reviewers answer the questions for each item; answers are saved as annotations.
- Use annotated items in evaluations/experiments or export them for offline analysis.
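The steps above can be sketched as a small in-memory model. This is a hedged illustration of the flow, not Arcane's client or API: the `InMemoryQueue` class, its method names, and the item IDs are all hypothetical.

```python
import json
from collections import defaultdict


class InMemoryQueue:
    """Hypothetical stand-in for an annotation queue (not Arcane's API)."""

    def __init__(self, name, item_type, questions):
        self.name = name
        self.item_type = item_type            # "traces" or "conversations"
        self.questions = list(questions)      # question texts in the template
        self.items = []                       # enqueued item IDs, in order
        self.annotations = defaultdict(dict)  # item_id -> {question: value}

    def enqueue(self, item_id):
        # Ignore duplicates so an item is reviewed once per queue.
        if item_id not in self.items:
            self.items.append(item_id)

    def answer(self, item_id, question, value):
        if item_id not in self.items:
            raise KeyError(f"{item_id} is not in queue {self.name!r}")
        if question not in self.questions:
            raise ValueError(f"unknown question: {question!r}")
        self.annotations[item_id][question] = value

    def pending(self):
        # Items that still have at least one unanswered question.
        return [
            item_id for item_id in self.items
            if len(self.annotations.get(item_id, {})) < len(self.questions)
        ]

    def export(self):
        # Serialize annotated items for offline analysis.
        rows = [
            {"item_id": item_id, "answers": self.annotations[item_id]}
            for item_id in self.items
            if item_id in self.annotations
        ]
        return json.dumps(rows)


question = "Is the answer factually correct?"
queue = InMemoryQueue("Factuality review", "traces", [question])
queue.enqueue("trace-001")
queue.enqueue("trace-002")
queue.answer("trace-001", question, True)

remaining = queue.pending()  # items still waiting for review
exported = queue.export()    # JSON string of annotated items
```

Tracking pending items separately from annotations is what lets a queue show reviewers what is left.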
Fields you’ll see
- Queue: name, description, type (Traces or Conversations).
- Question: text, helper, placeholder, type, options (for choice types), required, default, min/max (numeric).
- Answers: captured per question in the matching value type (string, boolean, number, array for multi-choice).
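To make the "matching value type" idea concrete, here is a rough validation sketch. The dictionary keys (`type`, `options`, `required`, `default`, `min`, `max`) and the `validate_answer` function are assumptions for illustration; Arcane's real field names and validation rules may differ. Falling back to the default when an optional question is left blank is also an assumption.

```python
def validate_answer(question, value):
    """Return the value if it fits the question's type, else raise ValueError."""
    qtype = question["type"]

    if value is None:
        if question.get("required", False):
            raise ValueError("an answer is required")
        # Assumption: a skipped optional question falls back to its default.
        return question.get("default")

    if qtype == "freeform":
        ok = isinstance(value, str)
    elif qtype == "boolean":
        ok = isinstance(value, bool)
    elif qtype == "single_choice":
        ok = isinstance(value, str) and value in question["options"]
    elif qtype == "multiple_choice":
        ok = isinstance(value, list) and all(
            v in question["options"] for v in value
        )
    elif qtype == "numeric":
        # bool is a subclass of int in Python, so exclude it explicitly.
        ok = isinstance(value, (int, float)) and not isinstance(value, bool)
        lo, hi = question.get("min"), question.get("max")
        ok = ok and (lo is None or value >= lo) and (hi is None or value <= hi)
    else:
        raise ValueError(f"unknown question type: {qtype}")

    if not ok:
        raise ValueError(f"{value!r} is not a valid {qtype} answer")
    return value


rating = {"type": "numeric", "required": True, "min": 1, "max": 5}
tags = {"type": "multiple_choice", "options": ["tone", "accuracy", "format"]}

valid_rating = validate_answer(rating, 4)
valid_tags = validate_answer(tags, ["tone", "format"])
```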
Good practices
- Keep each question to one clear intent (e.g., “Is the answer factually correct?”).
- Prefer choice types for consistency; use freeform for reviewer notes.
- Keep queue scopes focused (per feature or per release) to avoid stale items.
- Reuse queues to compare before/after changes to prompts or models.
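As a rough sketch of a before/after comparison, the snippet below computes the share of "yes" answers to one Boolean question for two batches reviewed in the same queue. The `pass_rate` helper is hypothetical and the answer lists are made-up sample values, not real results.

```python
def pass_rate(answers):
    """Fraction of True values among Boolean answers, or None if empty."""
    if not answers:
        return None
    return sum(1 for a in answers if a) / len(answers)


# Made-up Boolean answers to "Is the answer factually correct?"
before_change = [True, False, True, True]
after_change = [True, True, True, False, True]

rate_before = pass_rate(before_change)
rate_after = pass_rate(after_change)
delta = rate_after - rate_before  # positive means more "yes" labels after
```

Because both batches answer the same question template, the two rates are directly comparable; with samples this small, treat any difference as a prompt for closer review rather than a conclusion.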
Related concepts
- Pair queues with Scores and Evaluations to track quality over time.
- Use Conversations queues when you want full-session review; use Traces queues for single-execution review.