
A Rating is a single rater’s verdict on a single comparison. Human and autouser ratings share the same shape, so downstream analytics doesn’t have to branch on raterType.

Shape

{
  "id": "rat_clxq3...",
  "evaluationId": "eval_clxq3...",
  "comparisonId": "cmp_clxq3...",
  "raterType": "human",
  "userId": "usr_clxq3...",
  "publicRaterId": null,
  "autouserId": null,
  "autouserRunId": null,
  "rubricVersion": "v3",
  "dimensionRatings": {
    "overall": 4,
    "trust": 3,
    "clarity": 5
  },
  "openTextResponses": {
    "overall": "Felt fast but the trust signals were thin.",
    "trust": "No security badge, no review count."
  },
  "factors": null,
  "justification": "Liked the simplicity, missed the social proof.",
  "skipReason": null,
  "timeSpentSeconds": 87,
  "timingData": {
    /* ... */
  },
  "createdAt": "2026-05-04T10:21:08.123Z"
}
Discriminator:

userId: set when an authenticated user submitted this rating.
publicRaterId: set when an anonymous public rater (no account) submitted it.
autouserId: set when an autouser run produced the rating.
autouserRunId: set to the specific autouser run row.
Exactly one of (userId, publicRaterId) is set on human ratings; exactly one of (autouserId, autouserRunId) is set on autouser ratings.
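
If you type ratings in application code, the discriminator maps naturally onto a tagged union. The sketch below is illustrative rather than a generated client type: it assumes raterType is "human" | "autouser" (only "human" appears in the example above) and that unset fields come back as null, as in the JSON shape.

// Illustrative Rating types derived from the shape and discriminator above.
interface RatingBase {
  id: string;
  evaluationId: string;
  comparisonId: string;
  rubricVersion: string;
  dimensionRatings: Record<string, number>;       // e.g. { overall: 4, trust: 3 }
  openTextResponses: Record<string, string> | null;
  factors: Record<string, unknown> | null;
  justification: string | null;
  skipReason: string | null;
  timeSpentSeconds: number | null;
  timingData: Record<string, unknown> | null;
  createdAt: string;                               // ISO 8601 timestamp
}

interface HumanRating extends RatingBase {
  raterType: "human";
  userId: string | null;          // exactly one of userId / publicRaterId is set
  publicRaterId: string | null;
  autouserId: null;
  autouserRunId: null;
}

interface AutouserRating extends RatingBase {
  raterType: "autouser";
  userId: null;
  publicRaterId: null;
  autouserId: string | null;      // exactly one of autouserId / autouserRunId is set
  autouserRunId: string | null;
}

type Rating = HumanRating | AutouserRating;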

Listing

curl "https://app.autousers.ai/api/v1/evaluations/$EVAL_ID/ratings?limit=100" \
  -H "Authorization: Bearer $AUTOUSERS_API_KEY"
Cursor-paginate with starting_after. See Pagination.
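
A minimal pagination loop, sketched in TypeScript and reusing the Rating type from the sketch above. The response envelope fields (data, has_more) are assumptions; check the Pagination page for the exact names.

// Sketch: fetch every rating for an evaluation by following the
// starting_after cursor. The `data` / `has_more` envelope fields are assumed.
async function listAllRatings(evalId: string, apiKey: string): Promise<Rating[]> {
  const ratings: Rating[] = [];
  let startingAfter: string | undefined;

  while (true) {
    const params = new URLSearchParams({ limit: "100" });
    if (startingAfter) params.set("starting_after", startingAfter);

    const res = await fetch(
      `https://app.autousers.ai/api/v1/evaluations/${evalId}/ratings?${params}`,
      { headers: { Authorization: `Bearer ${apiKey}` } },
    );
    if (!res.ok) throw new Error(`Listing ratings failed: ${res.status}`);

    const page = await res.json();
    ratings.push(...page.data);

    if (!page.has_more) break;
    startingAfter = page.data[page.data.length - 1].id; // cursor = last id on this page
  }

  return ratings;
}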

Submitting a human rating via the API

Most ratings come from the dashboard or the public share link. If you need to submit one programmatically (e.g. wiring up a custom rater UI):
curl -X POST https://app.autousers.ai/api/v1/evaluations/$EVAL_ID/ratings \
  -H "Authorization: Bearer $AUTOUSERS_API_KEY" \
  -H "Content-Type: application/json" \
  -H "Idempotency-Key: $(uuidgen)" \
  -d '{
    "comparisonId": "cmp_clxq3...",
    "dimensionRatings": { "overall": 4, "trust": 3, "clarity": 5 },
    "justification": "Smooth flow, weak trust signals.",
    "timeSpentSeconds": 87
  }'
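
The same request from application code (for example a custom rater UI), sketched in TypeScript and assuming the endpoint echoes the created Rating back. Generate one Idempotency-Key per logical submission and reuse it on retries so a flaky network cannot double-count a rating.

// Sketch of the POST above from a custom rater UI. Reuse the same
// idempotency key when retrying one logical submission.
async function submitRating(
  evalId: string,
  apiKey: string,
  rating: {
    comparisonId: string;
    dimensionRatings: Record<string, number>;
    justification?: string;
    timeSpentSeconds?: number;
  },
  idempotencyKey: string = crypto.randomUUID(),
): Promise<Rating> {
  const res = await fetch(
    `https://app.autousers.ai/api/v1/evaluations/${evalId}/ratings`,
    {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
        "Idempotency-Key": idempotencyKey,
      },
      body: JSON.stringify(rating),
    },
  );
  if (!res.ok) throw new Error(`Submitting rating failed: ${res.status}`);
  return res.json();
}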

Agreement

Once you have ratings from ≥3 raters per comparison, agreement metrics become useful. The /agreement endpoint computes Krippendorff's α and, when there are exactly two raters, Cohen's κ.
curl https://app.autousers.ai/api/v1/evaluations/$EVAL_ID/agreement \
  -H "Authorization: Bearer $AUTOUSERS_API_KEY"
{
  "krippendorff": { "alpha": 0.74, "n_raters": 6, "n_items": 4 },
  "byDimension": {
    "overall": { "alpha": 0.81 },
    "trust": { "alpha": 0.62 },
    "clarity": { "alpha": 0.79 }
  },
  "ratingCount": 24,
  "cachedAt": "2026-05-04T11:02:13.000Z"
}

What the numbers mean

α < 0.4: no agreement. Treat results as anecdote, not signal.
0.4 ≤ α < 0.6: weak. Useful directionally, not for promotion gating.
0.6 ≤ α < 0.8: acceptable. Most teams ship gates at α ≥ 0.6.
α ≥ 0.8: strong. Suitable for automated CI gates.
Cohen's κ uses the same scale. When more than two raters are present we report only Krippendorff's α (Cohen's κ is undefined for more than two raters).
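
A small CI gate built on the response above, sketched in TypeScript. The 0.6 threshold comes from the "acceptable" band in the table; tune it to your own risk tolerance.

// Sketch: promotion gate on inter-rater agreement. Fails the run when the
// overall Krippendorff alpha is below the chosen threshold.
async function assertAgreement(
  evalId: string,
  apiKey: string,
  minAlpha = 0.6,
): Promise<void> {
  const res = await fetch(
    `https://app.autousers.ai/api/v1/evaluations/${evalId}/agreement`,
    { headers: { Authorization: `Bearer ${apiKey}` } },
  );
  if (!res.ok) throw new Error(`Agreement request failed: ${res.status}`);

  const agreement = await res.json();
  const alpha: number = agreement.krippendorff.alpha;

  if (alpha < minAlpha) {
    throw new Error(
      `Krippendorff alpha ${alpha} is below the gate of ${minAlpha}; ` +
        `treat the evaluation results as unreliable`,
    );
  }
}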

Caching

Agreement is cached on Evaluation.agreementCache and only recomputed when the rating count changes. The first call after a new rating is slightly slower (~100ms) as it warms the cache; subsequent calls are instant.

Streaming ratings into a warehouse

The shape is stable: dimensionRatings is a JSON map of dimension name to score, and factors and openTextResponses are JSON objects (or null). Subscribe to the rating.created webhook (see Events) and append rows to BigQuery / Snowflake as they arrive. Use Autousers-Event-Id as the dedup key on insert. See the Looker / BigQuery recipe.
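
A sketch of the webhook-to-warehouse path, assuming an Express receiver, a { type, data } event envelope, and the Autousers-Event-Id value arriving as a request header; the insertRatingRow helper is a placeholder for your BigQuery / Snowflake client. See the Events page for the real payload schema.

import express from "express";

// Placeholder for your warehouse client: stream the row and pass the event id
// through as the insert / dedup key so replayed webhooks do not double-insert.
async function insertRatingRow(row: Record<string, unknown>, dedupKey: string): Promise<void> {
  console.log(`would insert rating ${row["id"]} with dedup key ${dedupKey}`);
}

const app = express();
app.use(express.json());

// Sketch of a rating.created receiver. The envelope and header names beyond
// Autousers-Event-Id are assumptions, not the documented schema.
app.post("/webhooks/autousers", async (req, res) => {
  const eventId = req.get("Autousers-Event-Id");
  if (!eventId) {
    res.status(400).send("missing event id");
    return;
  }

  if (req.body.type === "rating.created") {
    await insertRatingRow(req.body.data, eventId);
  }

  res.status(200).send("ok"); // ack promptly so the delivery is not retried
});

app.listen(8080);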