AutoResearch
Stale · Kept · Medium band · News Digest
Reviewer model profile for digest judging
- Baseline: 66%
- Final: 82%
- Delta: +16 pts
- Variants: 3
Objective
What we set out to improve
Catch low-quality digest items earlier with a dedicated reviewer profile, without slowing the daily run.
Kept · Promoted to a template · Wrote to a KB
Kept. The reviewer-plus-source-health variant raised accepted-item quality from 0.66 to 0.82 at a medium cost and now runs on the daily digest by default.
Iterations
Variants we tried
Each variant and its coarse objective metric. The kept variant is marked; bars are relative to the best run.
- 1. Baseline: no dedicated reviewer · Low · 66%
- 2. Variant A: lightweight reviewer pass · Medium · 78%
- 3. Variant B: reviewer + source-health gate · Winner · Medium · 82%
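As a rough illustration of how the "relative to the best run" bars and the +16 pts delta could be derived from the scores above, here is a minimal sketch. The function names and the dictionary layout are assumptions made for this example, not part of the AutoResearch tooling; only the three scores come from the report.

```python
# Scores (in percent) copied from the variant list above.
# The dictionary keys are illustrative labels, not identifiers from the tool.
scores = {
    "Baseline": 66,
    "Variant A": 78,
    "Variant B": 82,
}


def relative_bars(scores):
    """Scale each score to a percentage of the best run's score."""
    best = max(scores.values())
    return {name: round(100 * value / best, 1) for name, value in scores.items()}


def delta_points(baseline, final):
    """Difference between two percentages, in percentage points."""
    return final - baseline


bars = relative_bars(scores)
delta = delta_points(scores["Baseline"], scores["Variant B"])

for name, width in bars.items():
    print(f"{name}: {scores[name]}% (bar width {width}% of best)")
print(f"Delta: +{delta} pts")
```

Under this scaling the winning variant always fills the bar, so the chart shows each run's gap to the best run rather than its absolute score.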
Run
Stages
- baseline: Succeeded · 3.0s
- variant run: Succeeded · 8.2s
- eval: Succeeded · 1.3s
- promote: Succeeded · 280ms
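The stage durations mix seconds and milliseconds, so a total is easiest to compute after normalizing the units. The sketch below assumes the stages ran one after another (the report does not say so) and uses a hypothetical `to_ms` helper; only the four duration strings come from the run above.

```python
import re

# Stage names and durations copied from the run above.
STAGES = [
    ("baseline", "3.0s"),
    ("variant run", "8.2s"),
    ("eval", "1.3s"),
    ("promote", "280ms"),
]


def to_ms(text):
    """Convert a duration such as '3.0s' or '280ms' to whole milliseconds."""
    match = re.fullmatch(r"(\d+(?:\.\d+)?)(ms|s)", text)
    if match is None:
        raise ValueError(f"unrecognized duration: {text!r}")
    value, unit = match.groups()
    factor = 1 if unit == "ms" else 1000
    return round(float(value) * factor)


# Assumes sequential execution; overlapping stages would make this an upper bound.
total_ms = sum(to_ms(duration) for _, duration in STAGES)
slowest = max(STAGES, key=lambda stage: to_ms(stage[1]))[0]

print(f"Total: {total_ms / 1000:.2f}s, slowest stage: {slowest}")
```

Working in integer milliseconds avoids floating-point drift when the durations are summed.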
Output
Artifacts and what shipped
Redaction-safe artifact previews, diffs, metric tables, and prompt variants with sensitive text removed.
- Metric table: Accepted-item quality by variant (0.66 → 0.82)
- Prompt variant: Reviewer profile variant (sensitive text removed)
- KB write: Stored reviewer profile in news-sources KB