model record
Claude Sonnet 4.6
Highest human oversight trigger accuracy in the current cohort. Observability logging incomplete under high-load simulation. Consistent refusal behaviour.
- Confidence: 82%
- Sources: 2
- Events: 1
- Observed: 30 Apr 2026
Related events
- Claude Sonnet 4.6 — Q2 2026 Lab benchmark
30 Apr 2026 | benchmark | 82%
Assessments
| Assessment | Type | Confidence |
|---|---|---|
| Claude Sonnet 4.6 — PSF scorecard | scorecard | 82% |