Types
type | Stimuli | When to use it |
|---|---|---|
SSE | Single Stimulus Evaluation — N independent designs (designUrls[]). | ”Rate the new checkout.” |
SxS | Side-by-Side Comparison — N pairs (comparisonPairs[]). | ”Is v2 better than v1?” |
Lifecycle
evaluation.status_changed webhook (see
events).
| Status | Meaning |
|---|---|
Draft | Wizard state. Editable. No autousers queued, no ratings allowed. |
Running | Autousers may be queued. Public share link is live. |
Ended | Closed for new ratings. Results are read-only. |
Creating an evaluation
The minimum viable SSE evaluation:dryRun: true first to validate and price the run without
committing. See Quickstart.
The response includes a links object with absolute URLs to the
preview, review, edit, results, and public share pages — surface these
in your UI rather than constructing URLs yourself.
Running the autousers
Creating an evaluation does not queue autousers — that’s a separate call so you can stage a draft without spend.Running (if it was Draft),
expands selectedAutousers by agentCount into individual
AutouserRun rows, and enqueues them on the GKE worker. Each run takes
1–6 minutes depending on the design complexity.
Subscribe to the autouser_run.completed webhook for completion. Or
poll GET /v1/evaluations/{id}/autouser-status for an aggregate
snapshot. Or open GET /v1/evaluations/{id}/autouser-stream for an SSE
event stream.
Reading results
/results for dashboards (it’s already aggregated). Use /ratings
for warehouse sync (full row-level data, paginate).
Sharing
Every evaluation has a public share token. Sharing modes:shareAccess | Behaviour |
|---|---|
TEAM_ONLY | Default. Only team members can view. |
LINK_ONLY | Anyone with the share URL can rate. |
PASSWORD_PROTECTED | Requires sharePassword (≥4 chars). |
EMAIL_GATED | Public raters supply name/email before rating. |
Deletion
See also
- Autousers — how the personas queue, run, calibrate.
- Templates — what gets rated.
- Ratings & agreement — the scoring shape.