AutoResearch
Stale · Kept · Medium band · News Digest

Reviewer model profile for digest judging

Baseline: 66%
Final: 82%
Delta: +16 pts
Variants: 3
Objective

What we set out to improve

Catch low-quality digest items earlier with a dedicated reviewer profile, without slowing the daily run.

Kept · Promoted to a template · Wrote to a KB

Kept. The reviewer-plus-source-health variant raised accepted-item quality from 0.66 to 0.82 at a medium cost and now runs on the daily digest by default.

Iterations

Variants we tried

Each variant and its coarse objective metric. The kept variant is marked; bars are relative to the best run.

  1. Baseline (no dedicated reviewer) · Low · 66%
  2. Variant A (lightweight reviewer pass) · Medium · 78%
  3. Variant B (reviewer + source-health gate) · Winner · Medium · 82%
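
The relationship between the headline numbers and the bars can be sketched in a few lines. This is only an illustration, assuming the bars are each variant's score divided by the best score and the delta is a plain difference in percentage points; the function names `relative_bars` and `delta_points` are hypothetical and not part of the page.

```python
# Coarse objective metric per variant, in percent, as listed above.
scores = {
    "Baseline": 66,
    "Variant A": 78,
    "Variant B": 82,
}


def relative_bars(scores: dict[str, float]) -> dict[str, float]:
    """Scale each score to a fraction of the best run (best run = 1.0)."""
    best = max(scores.values())
    return {name: value / best for name, value in scores.items()}


def delta_points(baseline: float, final: float) -> float:
    """Difference in percentage points, not a relative percentage change."""
    return final - baseline


bars = relative_bars(scores)
winner = max(scores, key=scores.get)
delta = delta_points(scores["Baseline"], scores[winner])

for name, width in bars.items():
    marker = " (kept)" if name == winner else ""
    print(f"{name}: {scores[name]}% bar={width:.2f}{marker}")
print(f"Delta: +{delta} pts")
```

Under this assumption the winning variant always gets a full-width bar, and the +16 pts figure is the gap between 66% and 82% rather than a relative improvement.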
Run

Stages

  1. baseline

    Succeeded · 3.0s

  2. variant run

    Succeeded · 8.2s

  3. eval

    Succeeded · 1.3s

  4. promote

    Succeeded · 280ms
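
Because the objective was to avoid slowing the daily run, it can help to total the stage durations shown above. The sketch below is a hedged illustration: `parse_duration` is a hypothetical helper, and it assumes durations appear only in the `s` and `ms` forms seen on this page.

```python
def parse_duration(text: str) -> float:
    """Convert a duration label such as '3.0s' or '280ms' to seconds."""
    text = text.strip()
    if text.endswith("ms"):  # check 'ms' first, since it also ends with 's'
        return float(text[:-2]) / 1000.0
    if text.endswith("s"):
        return float(text[:-1])
    raise ValueError(f"unrecognized duration: {text!r}")


# Stage durations as displayed in the Stages list.
stages = {
    "baseline": "3.0s",
    "variant run": "8.2s",
    "eval": "1.3s",
    "promote": "280ms",
}

seconds = {name: parse_duration(label) for name, label in stages.items()}
total = sum(seconds.values())
slowest = max(seconds, key=seconds.get)

print(f"total: {total:.2f}s, slowest stage: {slowest}")
```

With the figures on this page the four stages add up to just under 13 seconds, most of it in the variant run.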

Output

Artifacts and what shipped

Redaction-safe artifact previews, diffs, metric tables, and prompt variants with sensitive text removed.

  • Metric table

    Accepted-item quality by variant (0.66 → 0.82)

  • Prompt variant

    Reviewer profile variant (sensitive text removed)

  • KB write

    Stored reviewer profile in news-sources KB

What you can see, and what is hidden

Every projection on this page is redaction-safe by construction. Redaction level: sample content (curated, public-safe excerpts only).

Shown

  • Identifiers & counts
  • Closed-enum statuses
  • Coarse quality / resource bands
  • Timestamps & freshness

Intentionally hidden

  • Raw prompts
  • Raw documents
  • Raw tool logs
  • Raw trace spans
  • Embedding vectors
  • Free-text feedback
  • Auth internals & secrets
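
One way to read "redaction-safe by construction" is as an allowlist projection: only fields from the shown categories are copied out, so anything new or unlisted stays hidden by default. The sketch below assumes that reading. The page names only categories, so every field name here (`run_id`, `quality_band`, `raw_prompt`, and so on) is hypothetical.

```python
from typing import Any

# Hypothetical field names, one or two per "Shown" category:
# identifiers & counts, closed-enum statuses, coarse bands, timestamps.
SHOWN_FIELDS = frozenset({
    "run_id",
    "variant_count",
    "status",
    "quality_band",
    "resource_band",
    "updated_at",
})


def project(record: dict[str, Any]) -> dict[str, Any]:
    """Copy only allowlisted fields; everything else is dropped."""
    return {key: value for key, value in record.items() if key in SHOWN_FIELDS}


# Hypothetical internal record mixing shown and hidden material.
record = {
    "run_id": "run-001",
    "variant_count": 3,
    "status": "kept",
    "resource_band": "medium",
    "updated_at": "2024-01-01T00:00:00Z",
    "raw_prompt": "(hidden)",
    "embedding": [0.1, 0.2],
    "api_key": "(hidden)",
}

safe = project(record)
print(sorted(safe))
```

The design choice in this sketch is that hiding is the default: a field added to the record later does not appear on the page until someone adds it to the allowlist, which is the opposite of a denylist that must be updated for every new sensitive field.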

Related in the Lab