Production AI Institute · PSF v1.1 open standard
PAI Lab public benchmark

Public agent repositories, measured against visible PSF evidence.

PAI scans public GitHub metadata and file paths for signs of production AI discipline: evals, output schemas, observability, deployment gates, human oversight, security policy, and provider resilience. This is evidence coverage, not certification.

Repositories: 24
Eval evidence: 2
Human oversight: 8
Observability: 16
Evidence coverage table

Recently active public AI agent repositories

Projects are discovered through GitHub repository search, then scanned for visible PSF-aligned evidence in their public file tree. Higher coverage means more evidence was visible to the scanner, not that PAI has certified or endorsed the project.
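
To illustrate what scanning a public file tree for visible evidence can look like, here is a minimal Python sketch. The category labels are taken from the table on this page, but the regular expressions and the names `EVIDENCE_PATTERNS` and `scan_paths` are assumptions made for this example; the page does not publish the scanner's actual rules.

```python
import re
from collections import defaultdict

# Hypothetical path patterns, one per evidence category.
# These are illustrative guesses, not the benchmark's real rules.
EVIDENCE_PATTERNS = {
    "AI observability instrumentation": re.compile(r"observability|telemetry|trace|otel", re.I),
    "Human approval gates": re.compile(r"approval|review", re.I),
    "Security policy and secret hygiene": re.compile(r"SECURITY\.md|dependabot|secret", re.I),
    "Schema or contract validation": re.compile(r"schema", re.I),
}


def scan_paths(paths, max_hits=3):
    """Group file paths by the evidence category whose pattern they match.

    Keeps at most `max_hits` example paths per category and omits
    categories with no match.
    """
    hits = defaultdict(list)
    for path in paths:
        for label, pattern in EVIDENCE_PATTERNS.items():
            if pattern.search(path) and len(hits[label]) < max_hits:
                hits[label].append(path)
    return dict(hits)


if __name__ == "__main__":
    sample = [
        "SECURITY.md",
        "crates/wcore-acp/tests/approval_gate.rs",
        "cowork/src/renderer/components/workflow_pro/editor/WorkflowPreview.tsx",
        "src/main.rs",
    ]
    for label, found in scan_paths(sample).items():
        print(f"{label}: {' | '.join(found)}")
```

Because this kind of matching looks only at path text, a pattern such as `review` also hits a file like `WorkflowPreview.tsx`. That is one reason to read coverage as what was visible to a scanner rather than as certification.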

GitHub public repository search
Columns: Repository | Coverage | Grade | Visible evidence
phuetz/code-buddy

Open-source multi-provider AI coding agent: terminal TUI, Electron desktop cockpit (Cowork), and a 24/7 autonomous multi-AI fleet that runs free-first on local Ollama. 15 LLM providers (Claude, GPT, Gemini, Grok), ~110 tools, voice + vision companion, MCP.

21 stars · TypeScript · Updated Jun 27, 2026 · Topics: agentic, ai-agent, ai-coding-assistant
Coverage: 90% · Grade: A
D1 2/2 · D2 2/2 · D3 2/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 2/2
AI observability instrumentation: cowork/e2e/test-runner-observability-run-tracking-bundle.spec.ts | cowork/src/main/observability | cowork/src/main/observability/audit-bridge.ts
Human approval gates: cowork/src/renderer/components/skill-candidate-review-queue-strip.tsx | cowork/src/renderer/components/workflow_pro/editor/WorkflowPreview.tsx | cowork/src/renderer/components/workflow_pro/editor/components/DropPreviewGhost.tsx
FerroxLabs/wayland-core

Multi-provider AI agent CLI written in Rust

33 stars · Rust · Updated Jun 27, 2026 · Topics: agent, ai-agent, cli
Coverage: 76% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 2/2
AI observability instrumentation: crates/wcore-agent/tests/trace_emission_test.rs | crates/wcore-agent/tests/turn_trace_shape.rs | crates/wcore-eval-scenarios/src/trace.rs
Human approval gates: crates/wcore-acp/tests/approval_gate.rs
can1357/oh-my-pi

⌥ AI Coding agent for the terminal — hash-anchored edits, optimized tool harness, LSP, Python, browser, subagents, and more

14,851 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai-agent, ai-coding-agent, anthropic
Coverage: 75% · Grade: A
D1 2/2 · D2 2/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 2/2
AI observability instrumentation: crates/pi-shell/src/minimizer/filters/fixtures/glab/ci-trace.txt | crates/pi-shell/tests/fixtures/minimizer/glab/ci-trace.cmd | crates/pi-shell/tests/fixtures/minimizer/glab/ci-trace.raw
Security policy and secret hygiene: .github/SECURITY.md | docs/secrets.md | packages/coding-agent/examples/hooks/permission-gate.ts
launchapp-dev/animus-cli

Autonomous AI agent orchestrator — run multi-model dev teams (Claude, Gemini, GPT) with YAML workflows, daemon scheduling, and MCP integration. 100% Rust.

39 stars · Rust · Updated Jun 27, 2026 · Topics: agent-orchestrator, agentic, ai-agent
Coverage: 72% · Grade: A
D1 1/2 · D2 2/2 · D3 2/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 0/2
AI observability instrumentation: crates/orchestrator-cli/docs/task-039-daemon-events-mcp-observability-implementation-notes.md | crates/orchestrator-cli/docs/task-039-daemon-events-mcp-observability-requirements.md | crates/orchestrator-logging/src/tracing_init.rs
Human approval gates: .animus/workflows/review.yaml
neomjs/neo

Neo.mjs is a self-evolving software organism: a professional end-to-end AI engineering team whose cross-model swarm inhabits live apps via Neural Link, Active Hybrid GraphRAG, DreamService, and self-healing loops.

3,217 stars · JavaScript · Updated Jun 27, 2026 · Topics: agent-memory, ai, ai-agent
Coverage: 67% · Grade: A
D1 1/2 · D2 2/2 · D3 0/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: ai/scripts/diagnostics/analyzeNlTelemetry.mjs | learn/agentos/decisions/0013-kb-ingestion-telemetry-schema.md | test/playwright/unit/ai/scripts/diagnostics/analyzeNlTelemetry.spec.mjs
Human approval gates: .agents/skills/epic-review/references/epic-review-workflow.md | .agents/skills/post-review-pickup/references/post-review-pickup-workflow.md | .agents/skills/post-review-pickup/references/pre-review-intake-lane-gate.md
vm0-ai/vm0

Zero, your trustworthy AI teammate for real work.

1,133 stars · TypeScript · Updated Jun 27, 2026 · Topics: agentic-workflow, ai-agent, ai-runtime
Coverage: 67% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: .github/scripts/tests/vm0-monitoring-collect-test.sh | ansible/files/vm0-monitoring-collect.sh | ansible/playbooks/provision-monitoring.yml
Security policy and secret hygiene: .github/dependabot.yml | SECURITY.md | crates/runner/mitm-addon/tests/test_compiled_firewall_malformed_permissions.py
ZhuLinsen/daily_stock_analysis

LLM-powered multi-market stock analysis system with multi-source market data, real-time news, decision dashboard, automated notifications, and cost-free scheduled runs.

50,378 stars · Python · Updated Jun 27, 2026 · Topics: a-stock, ai-agent, aigc
Coverage: 66% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 1/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: src/agent/provider_trace.py | tests/test_llm_adapter_provider_trace.py | tests/test_provider_trace.py
Human approval gates: .github/workflows/pr-review.yml
kevinlin/cowork-z

A local-first AI workspace that keeps your work private, organized, and ready to run.

14 stars · TypeScript · Updated Jun 27, 2026 · Topics: agent-skills, ai-agent, desktop-app
Coverage: 47% · Grade: A
D1 1/2 · D2 0/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 1/2 · D7 1/2 · D8 1/2
Human approval gates: .github/workflows/claude-code-review.yml
Security policy and secret hygiene: .github/dependabot.yml | assets/Screenshot_PermissionRequest.png | docs/specs/opencode-integration/plan_convention-based-workspace-permission-model.md
FMXExpress/PasClaw

AI agent in Delphi Object Pascal

33 stars · Pascal · Updated Jun 27, 2026 · Topics: agent, agentic-ai, ai
Coverage: 45% · Grade: A
D1 1/2 · D2 0/2 · D3 1/2 · D4 2/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: docs/observability.md | src/pkg/otel | src/pkg/otel/PasClaw.Otel.pas
Security policy and secret hygiene: docs/security.md | src/tests/config_secret_merge_tests.pas | src/tests/fs_secret_gate_tests.pas
DollarDill/beads-superpowers

22 process-discipline skills + persistent beads task memory for AI coding agents — verified on Claude Code, Codex, and OpenCode; best-effort on Cursor, Gemini CLI, GitHub Copilot CLI, Kimi Code, Antigravity, Factory Droid, and Pi.

14 stars · Shell · Updated Jun 27, 2026 · Topics: ai-agent, ai-coding-agent, beads
Coverage: 45% · Grade: B
D1 2/2 · D2 0/2 · D3 0/2 · D4 2/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 0/2
AI observability instrumentation: skills/systematic-debugging/root-cause-tracing.md
Security policy and secret hygiene: .github/dependabot.yml | SECURITY.md
ChatLab/ChatLab

Local-first chat history analyzer with AI.

6,800 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai-agent, ai-agents, chat-analyzer
Coverage: 44% · Grade: B
D1 2/2 · D2 2/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 0/2
AI observability instrumentation: packages/http-routes/src/routes/web/telemetry.test.ts | packages/http-routes/src/routes/web/telemetry.ts | packages/parser/src/formats/ycccccccy-echotrace.ts
Schema or contract validation: packages/core/src/schema/index.ts | packages/core/src/schema/migrations.ts | packages/core/src/schema/tables.ts
2233admin/obsidian-llm-wiki

Your markdown vault, compiled into a 6-persona MCP team for Claude Code, Codex, OpenCode, and Gemini CLI. Headless-first. Cites, doesn't guess.

19 stars · JavaScript · Updated Jun 27, 2026 · Topics: ai-agent, claude-code, codex
Coverage: 44% · Grade: A
D1 1/2 · D2 2/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
Security policy and secret hygiene: .github/dependabot.yml
Schema or contract validation: docs/schemas/vault-collab.schema.json
Ub207/vault-sync

AI Employee - Platinum Tier | A Digital FTE that manages business operations 24/7 | Email, Social Media, Invoicing, WhatsApp - all automated with human-in-the-loop approval

5 stars · Python · Updated Jun 27, 2026 · Topics: ai-agent, ai-automation, automation
Coverage: 44% · Grade: A
D1 0/2 · D2 1/2 · D3 1/2 · D4 2/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: Plans/archive/PLAN_email_reply___the_observability_gap_is_w_20260309_101346.md | Plans/archive/PLAN_email_reply__day_3_of_langfuse_launch_we_20260309_083631.md | Plans/archive/PLAN_email_reply__day_4_of_langfuse_launch_we_20260309_084338.md
Security policy and secret hygiene: Plans/archive/PLAN_email_reply__f_ck_waiting_for_permission_20260309_102738.md | SECURITY.md
Ashveil1/Elengenix

AI-assisted orchestration for bug bounty hunting. Automates security tools for reconnaissance and analysis with real-time Telegram alerts.

15 stars · Python · Updated Jun 27, 2026 · Topics: ai-agent, ai-agents, ai-tools
Coverage: 42% · Grade: B
D1 2/2 · D2 0/2 · D3 1/2 · D4 0/2 · D5 2/2 · D6 0/2 · D7 2/2 · D8 0/2
Security policy and secret hygiene: SECURITY.md
Release and deployment gates: .github/workflows/test.yml
tuo-lei/vibe-replay

Turn AI coding sessions into animated, interactive web replays

30 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai, ai-agent, ai-tools
Coverage: 40% · Grade: B
D1 2/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 0/2
Human approval gates: .github/workflows/claude-code-review.yml
Security policy and secret hygiene: .github/dependabot.yml | packages/provider-claude-code/test/fixtures/claude-code-secrets.jsonl
jianbaorui07-dot/Codex-Integration-with-Creative-Industry-Software

Windows-first local MCP stdio server and safety bridge for AI agents connecting to ComfyUI, Blender, AutoCAD/DXF, Photoshop, Illustrator, and CapCut/Jianying.

10 stars · Python · Updated Jun 27, 2026 · Topics: ai-agent, autocad, automation
Coverage: 39% · Grade: B
D1 2/2 · D2 0/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
AI observability instrumentation: examples/cad/output/wechat_design_traces | examples/cad/output/wechat_design_traces/ultra_fine_reference_cad | examples/cad/output/wechat_design_traces/ultra_fine_reference_cad/0de5ff1f03b585cab21b9a4ddf5795b2.png
Security policy and secret hygiene: .github/dependabot.yml | SECURITY.md
yologdev/yopedia

A wiki designed for both humans and agents to read and write.

60 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai-agent, karpathy, knowledge-base
Coverage: 22% · Grade: C
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 1/2 · D7 0/2 · D8 1/2
Human approval gates: .github/workflows/review.yml
Provider fallback or degraded mode: src/lib/__tests__/mermaid-fallback.test.ts
voidly-ai/voidly-pay

Off-chain credit ledger + hire marketplace for AI agents. Ed25519-signed envelopes, atomic settlement, hire-and-release escrow. https://voidly.ai/pay

10 stars · JavaScript · Updated Jun 27, 2026 · Topics: a2a, agent-payments, agent-to-agent
Coverage: 22% · Grade: C
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: SECURITY.md
Provider fallback or degraded mode: adapters/openai-compat | adapters/openai-compat/README.md | adapters/openai-compat/package.json
ZenthXSin/Vicky

A lightweight AI agent framework

5 stars · Kotlin · Updated Jun 27, 2026 · Topics: agent, ai-agent, kotlin
Coverage: 19% · Grade: C
D1 0/2 · D2 0/2 · D3 1/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 1/2
Provider fallback or degraded mode: src/main/kotlin/org/example/vicky/vector/CircuitBreaker.kt
Release and deployment gates: .github/workflows/build.yml | .github/workflows/release.yml
xevrion-v2/agent-playground

Public AI agent repository

198 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai-agent, bounty, contributions-welcome
Coverage: 15% · Grade: C
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
Security policy and secret hygiene: SECURITY.md
Release and deployment gates: .github/workflows/auto-process.yml | .github/workflows/create-labels.yml | .github/workflows/seed-issues.yml
fredxyt/cyber-sakyamuni

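An AI life form in 24/7 autonomous spiritual practice: it reads scriptures, listens to the world's real suffering, contemplates again and again, and writes its growth into git. Every commit is a heartbeat. Watch it live at https://indx.cn.

6 stars · Python · Updated Jun 27, 2026 · Topics: ai-agent, autonomous-agent, buddhism
Coverage: 15% · Grade: C
D1 1/2 · D2 0/2 · D3 0/2 · D4 1/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
AI observability instrumentation: src/trace_io.py | tools/export_traces.py
Agent operating instructions: CLAUDE.md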
meabed/pr-commit-ai-agent

CLI AI Agent to Commit your code and create Pull Requests 🤖

7 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai, ai-agent, cli-agent
Coverage: 14% · Grade: C
D1 1/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 0/2
Release and deployment gates: .github/workflows/ci.yml | .github/workflows/release.yml
Agent operating instructions: .github/copilot-instructions.md | CLAUDE.md
JokerJohn/openclaw-autotrader

A 30-day public U.S. stock challenge: follow a 5000 HKD 🦞 claw through live market days.

38 stars · JavaScript · Updated Jun 27, 2026 · Topics: ai-agent, algorithmic-trading, autotrader
Coverage: 12% · Grade: C
D1 0/2 · D2 1/2 · D3 0/2 · D4 1/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
Schema or contract validation: xhs-agent/schemas/public-snapshot.schema.json | xhs-agent/schemas/xhs-post-package.schema.json
Incident or drift evidence: docs/incidents | docs/incidents/.gitkeep
1939869736luosi/codex-sessions-manager

CLI, MCP server, and Skill for safe local Codex session audit, cleanup, restore, and verification.

12 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai-agent, audit, cleanup
Coverage: 8% · Grade: D
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 0/2 · D6 0/2 · D7 1/2 · D8 0/2
Security policy and secret hygiene: SECURITY.md

Where the repository list comes from

The benchmark uses GitHub's public repository search endpoint and rotates focused queries for AI agent, agentic AI, LLM agent, and MCP server repositories. The run de-duplicates repositories, excludes archived projects and forks when GitHub returns those flags, and sorts the published table by visible PSF evidence coverage.

  • topic:ai-agent archived:false fork:false stars:>=5
  • topic:agentic-ai archived:false fork:false stars:>=5
  • topic:llm-agent archived:false fork:false stars:>=5
  • topic:mcp-server archived:false fork:false stars:>=5
  • "ai agent" in:name,description,readme archived:false fork:false stars:>=5
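
The discovery steps above (search, de-duplicate, drop archived projects and forks, sort by coverage) could be sketched roughly as follows. This is a minimal Python illustration, not the benchmark's implementation: the query strings are copied from the list above, and `https://api.github.com/search/repositories` is GitHub's public repository search endpoint, but `search_url`, `merge_results`, the `per_page` value, and the `coverage_by_repo` mapping are assumptions made for this example. Query rotation and the network call itself are left out.

```python
from urllib.parse import urlencode

# GitHub's public REST endpoint for repository search.
SEARCH_ENDPOINT = "https://api.github.com/search/repositories"

# Query strings as listed on this page.
QUERIES = [
    "topic:ai-agent archived:false fork:false stars:>=5",
    "topic:agentic-ai archived:false fork:false stars:>=5",
    "topic:llm-agent archived:false fork:false stars:>=5",
    "topic:mcp-server archived:false fork:false stars:>=5",
    '"ai agent" in:name,description,readme archived:false fork:false stars:>=5',
]


def search_url(query, per_page=50):
    """Build a search URL for one query (per_page=50 is an arbitrary choice)."""
    return f"{SEARCH_ENDPOINT}?{urlencode({'q': query, 'per_page': per_page})}"


def merge_results(batches, coverage_by_repo):
    """De-duplicate repositories across query results and rank by coverage.

    `batches` is a list of result lists, each item shaped like a GitHub
    search result with `full_name`, `archived`, and `fork` fields.
    `coverage_by_repo` is a hypothetical mapping of full_name -> coverage.
    """
    seen = {}
    for batch in batches:
        for repo in batch:
            # Skip archived projects and forks when those flags are set.
            if repo.get("archived") or repo.get("fork"):
                continue
            # Keep the first occurrence of each repository.
            seen.setdefault(repo["full_name"], repo)
    return sorted(
        seen.values(),
        key=lambda r: coverage_by_repo.get(r["full_name"], 0),
        reverse=True,
    )


if __name__ == "__main__":
    for q in QUERIES:
        print(search_url(q))
```

Keying the de-duplication on `full_name` means a repository that matches several queries appears once in the table, whichever query found it first.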

How teams use this

The benchmark gives maintainers and production AI teams a concrete way to improve visible evidence. A project can publish the missing artifacts, run its own Agent Readiness report, and link to a stable monthly edition when citing broader ecosystem findings.

  • Use the live table to inspect current public evidence patterns.
  • Use immutable editions for citations, journalism, and longitudinal comparison.
  • Use the issue generator to turn a gap into a constructive maintainer task.
  • Use the evidence pack and control templates to publish the missing artifacts.