Immutable benchmark edition

State of Agent Readiness: May 2026

A frozen PAI Lab edition capturing visible production-readiness evidence across public AI agent repositories. Use this URL for citations, reporting, and longitudinal comparison.

Repositories: 20
Average coverage: 38%
Human oversight: 20%
Observability: 55%
Edition finding

Deployment controls were visible; autonomy governance was not yet routine.

In this edition, the strongest visible domain was D5 Deployment control (85% of scanned repositories showed at least one signal). The weakest visible domain was D6 Human oversight (20%). In practice, maintainers can often show build and release discipline, but many still lack explicit approval gates, eval evidence, and operational incident procedures.

Domain coverage

D1: 65%
D2: 40%
D3: 30%
D4: 55%
D5: 85%
D6: 20%
D7: 70%
D8: 60%

Turning findings into action

  • Publish a docs/production-ai-readiness.md evidence map that links each PSF domain to the relevant repository artifact.
  • Add an evals folder with regression cases, scoring thresholds, and the current model or prompt version under test (see the sketch after this list).
  • Document human approval gates for high-impact actions, including who approves, what is logged, and what fails closed.
  • Add an incident response note, degraded-mode path, and provider fallback procedure.
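
The evals recommendation above can be made concrete as a small regression gate. The sketch below is illustrative only: the evals/cases.json path, the exact-match scorer, the model label, and the 90% threshold are assumptions, not requirements of the PSF or of any scanned repository.

    import json
    import sys
    from pathlib import Path

    THRESHOLD = 0.90                      # minimum pass rate required before release
    MODEL_UNDER_TEST = "agent-prompt-v3"  # record the model or prompt version under test

    def score_case(case: dict) -> bool:
        # Placeholder scorer: exact match; replace with the repository's own evaluator.
        return case.get("actual") == case.get("expected")

    def run_evals(path: str = "evals/cases.json") -> float:
        cases = json.loads(Path(path).read_text())
        return sum(score_case(c) for c in cases) / len(cases)

    if __name__ == "__main__":
        rate = run_evals()
        print(f"{MODEL_UNDER_TEST}: pass rate {rate:.2%} (threshold {THRESHOLD:.0%})")
        sys.exit(0 if rate >= THRESHOLD else 1)  # nonzero exit blocks the release in CI

Wiring a script like this into CI as a required check is one way to surface both eval evidence and a fail-closed release gate in the public file tree.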

Findings

  • Deployment control evidence appeared in 17 of 20 repositories, most often through CI workflows, release automation, or versioned deployment paths.
  • Human oversight evidence appeared in 4 of 20 repositories, making autonomy gates the clearest visible gap in this edition.
  • No repository in this public sample exposed eval evidence that matched the current scanner signal pattern (an illustrative sketch of such a pattern check appears after this list).
  • Security and provider resilience signals were more visible than data stewardship, suggesting maintainers often document engineering controls before data governance controls.
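
For readers unfamiliar with what a scanner signal pattern looks like, the sketch below shows a hypothetical path-based check for eval evidence. The patterns, function name, and example paths are illustrative; the PSF scanner's actual patterns are not reproduced in this edition.

    import re

    # Hypothetical path-based signal check; a repository "shows" a signal only
    # when at least one public file-tree path matches a pattern for that domain.
    EVAL_SIGNAL_PATTERNS = [
        r"(^|/)evals?/",                 # an evals/ or eval/ directory
        r"(^|/)evaluations?/",
        r"eval[^/]*\.(json|ya?ml|py)$",  # eval case or scoring files
    ]

    def has_eval_signal(paths: list[str]) -> bool:
        return any(re.search(p, path) for p in EVAL_SIGNAL_PATTERNS for path in paths)

    print(has_eval_signal(["README.md", "src/agent.py"]))    # False
    print(has_eval_signal(["evals/regression_cases.json"]))  # True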

Top visible evidence examples

  • serac-labs/serac (84%): AI observability instrumentation, Human approval gates, Security policy and secret hygiene
  • vm0-ai/vm0 (77%): AI observability instrumentation, Security policy and secret hygiene, Provider fallback or degraded mode
  • Icarus603/claude-code (76%): AI observability instrumentation, Human approval gates, Security policy and secret hygiene
  • HankHuang0516/EClaw (63%): AI observability instrumentation, Human approval gates, Security policy and secret hygiene
  • holaboss-ai/holaOS (59%): AI observability instrumentation, Security policy and secret hygiene, Provider fallback or degraded mode
  • bug-ops/zeph (58%): AI observability instrumentation, Security policy and secret hygiene, Provider fallback or degraded mode
  • slicenferqin/xuanpu (54%): AI observability instrumentation, Security policy and secret hygiene, Provider fallback or degraded mode
  • hrygo/hotplex (53%): AI observability instrumentation, Human approval gates, Security policy and secret hygiene

Source and method

Repository discovery used GitHub public repository search. The scanner reviewed public repository metadata and public file-tree paths only. It did not clone repositories, inspect private code, or certify the projects listed here. A minimal sketch of the discovery step appears after the query list below.

  • topic:ai-agent archived:false fork:false stars:>=5
  • topic:agentic-ai archived:false fork:false stars:>=5
  • topic:llm-agent archived:false fork:false stars:>=5
  • topic:mcp-server archived:false fork:false stars:>=5
  • "ai agent" in:name,description,readme archived:false fork:false stars:>=5

Citation note

Production AI Institute. "State of Agent Readiness - May 2026." Frozen May 13, 2026. Available at https://www.productionai.institute/agent-readiness/benchmark/2026-05.
  • This edition is an immutable archival snapshot. It does not change when the live benchmark changes.
  • The scan uses public repository metadata and public file-tree paths. It does not clone private code and it is not a certification or endorsement.
  • Authenticated production runs should use a server-side GitHub token to raise API limits before freezing future monthly editions.
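
A minimal sketch of that authenticated setup follows; the GITHUB_TOKEN variable name and the rate-limit check are illustrative assumptions, not the institute's pipeline.

    import os
    import requests

    # A server-side token raises scans to the authenticated limit (about 5,000
    # core requests per hour) rather than the anonymous 60 per hour.
    headers = {
        "Accept": "application/vnd.github+json",
        "Authorization": f"Bearer {os.environ['GITHUB_TOKEN']}",  # keep the token server-side
    }
    resp = requests.get("https://api.github.com/rate_limit", headers=headers, timeout=30)
    print(resp.json()["resources"]["search"])  # remaining search quota before a freeze run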