Production AI Institute · PSF v1.1 open standard
AI Right-To-KnowAI Data Use IndexCheck My AI ToolsPolicy Change WatchAgent ReadinessPublic BenchmarkContactGlobal standard · Worldwide
Insights / IncidentVendor reliability

Anthropic Claude API hit elevated errors across five frontier models on June 5, 2026

Anthropic's public status page records a same-day incident that degraded success rates on Claude Opus 4.5 through 4.8 and Sonnet 4.6 before full recovery. Production teams routing agent workloads through a single model slug need a failover plan when multiple tiers fail together.

Production AI Institute · 8 min read · Updated June 2026
Scope:Facts below come from Anthropic's June 5, 2026 status timeline. Anthropic had not published a detailed root-cause postmortem on the status page at the time this brief was drafted. Assessments on model capability are separate from this reliability note.

Short answer: On June 5, 2026 Anthropic reported elevated API errors affecting multiple Claude frontier models starting around 15:08 UTC. Recovery was staggered by model through 17:29 UTC, with the incident marked resolved at 18:28 UTC. If your production stack pins one Opus or Sonnet slug without health-checked fallbacks, a shared control-plane incident can stall every tier at once. Treat this as a PSF vendor-resilience signal, not a model-quality regression.

What changed

Anthropic opened investigation at 15:19 UTC on June 5, 2026 for elevated errors across many Claude models. The vendor identified a shared issue by 15:43 UTC, noted ongoing problems on Opus 4.7 and 4.8 through mid-afternoon, and reported success rates returning to expected levels by 17:13 UTC. The incident was resolved at 18:28 UTC.

Anthropic's final update states the incident started at 15:08 UTC and lists per-model recovery times: Opus 4.6 at 15:25 UTC, Sonnet 4.6 at 16:23 UTC, Opus 4.8 at 16:59 UTC, Opus 4.7 at 17:12 UTC, and Opus 4.5 at 17:29 UTC. Earlier in the week Anthropic logged separate smaller degradations on June 1 through June 3; the June 5 event is the broadest multi-model cluster in that window.

FieldDetail
Incident date2026-06-05 (UTC)
VendorAnthropic (Claude API, api.anthropic.com)
Affected modelsClaude Opus 4.5, 4.6, 4.7, 4.8; Claude Sonnet 4.6 (per status timeline)
Duration~15:08 UTC start to 18:28 UTC resolved (~3h 20m); per-model recovery staggered to 17:29 UTC
Primary sourcestatus.anthropic.com incident timeline
Affected teamsAPI integrators, agent harnesses, Claude Code automation, MSP-managed client stacks on Anthropic-only routes
PSF domainsobservability, security, testing (vendor resilience and deployment safety themes)

Why production teams should care

A multi-model outage breaks the common assumption that pinning a smaller Sonnet slug while keeping Opus for hard tasks provides isolation. On June 5, both tiers degraded in the same window. Agent runtimes that retry the same provider with a different model ID may still fail when the root cause sits in shared API infrastructure.

Teams that launched or upgraded to Claude Opus 4.8 after the May 28 release should separate model-capability evaluation from uptime evidence. Our Opus 4.8 assessment covers effort control and parallel subagents; this incident shows you still need cross-vendor failover and customer-visible degradation modes.

PSF control implications

Vendor resilience (Domain 8)

Single-vendor agent stacks experienced correlated failure across pricing tiers. See the D8 vendor resilience guide and PSF Domain 8 overview for failover patterns.

Practitioner action: Maintain a tested secondary provider route and circuit-break when Anthropic error rates exceed your SLO for two consecutive probe windows.

Observability (Domain 4)

Per-model recovery stagger means aggregate dashboards can look healthy while one pinned slug is still failing. Emit success-rate metrics tagged by model ID, not only by vendor label.

Practitioner action: Add synthetic completion probes per production model slug and page on divergence from baseline before public status moves to resolved.

Deployment safety (Domain 5)

Long-running agent sessions started before 15:08 UTC may have partially completed with silent retries. Replay idempotent tool steps after vendor incidents, do not assume mid-run state is consistent.

Practitioner action: Ship a feature-flag bundle that pauses non-essential agent tools and serves cached or human-reviewed responses for tier-1 journeys during vendor degradation.

What to do today

  1. Pull your June 5 API error logs by model slug and compare against Anthropic's per-model recovery timestamps.
  2. Rehearse failover to a secondary vendor or region on a staging agent harness; document time-to-recover.
  3. Update customer status comms templates with Anthropic component names your product actually calls.
  4. If you run Claude Code or Cowork in production paths, check whether security reviews or routines failed during the June 3 Claude Code service incident and June 5 API errors.
  5. File an internal incident record linking to this episode for quarterly vendor-risk review and SLA credit evaluation.

Where this fits in PAI

This event is distinct from model-launch assessments. For taxonomy and monitoring patterns, see seven failure modes of production AI, OpenAI May 2026 multi-service outage, and the Production AI Standard. MSP teams advising clients on Anthropic-only stacks should review the MSP partner path and Deployment Safety Assessment when board visibility on vendor risk is required.

FAQ

Is the Claude API still down?

Anthropic marked the June 5, 2026 incident resolved at 18:28 UTC. Check status.anthropic.com before acting; June 2026 also shows smaller prior degradations on June 1 through June 3.

Which Claude models were affected?

Anthropic's resolved update lists Opus 4.6, Sonnet 4.6, Opus 4.8, Opus 4.7, and Opus 4.5 with staggered recovery between 15:25 UTC and 17:29 UTC on June 5, 2026.

Does this change our Opus 4.8 PSF assessment?

No. Model capability ratings and incident reliability are separate evidence. Re-run production evals after outages, but do not conflate a shared API incident with regression in reasoning quality.

Should we switch models within Anthropic only?

Switching slugs within the same vendor helped only after each model recovered. For correlated failures, assessments indicate a secondary provider or queued degradation mode is the safer control.

What credential path fits vendor-risk incidents?

Individual practitioners should understand failover design through AIDA. Teams presenting vendor risk to leadership should pair this brief with DSA or the certification directory.

Sources

Apply today's signal

Turn the release into proof you can use.

Use the PSF to understand the control change, then choose the proof path that matches your role. Most readers should start with a personal credential; buyers and MSPs can branch from there.

The Production AI Brief