Production AI Institute · PSF v1.1 open standard
PAI Lab public benchmark

Public agent repositories, measured against visible PSF evidence.

PAI scans public GitHub metadata and file paths for signs of production AI discipline: evals, output schemas, observability, deployment gates, human oversight, security policy, and provider resilience. This is evidence coverage, not certification.

Repositories: 24
Eval evidence: 3
Human oversight: 5
Observability: 17
Evidence coverage table

Recently active public AI agent repositories

Projects are discovered through GitHub repository search, then scanned for visible PSF-aligned evidence in their public file tree. Higher coverage means more evidence was visible to the scanner, not that PAI has certified or endorsed the project.
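
A path-based scan of this kind can be sketched as a keyword match over a repository's file tree. This is a minimal illustration only: the category names come from the table, but the regular expressions, the `scan_paths` helper, and the three-hit limit are assumptions made for the example, not the rules the PAI scanner applies.

```python
import re
from typing import Dict, List

# Hypothetical keyword patterns per evidence category. These are illustrative
# assumptions, not the patterns used by the PAI scanner.
EVIDENCE_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "AI observability instrumentation": re.compile(
        r"observab|telemetry|otel|trace|monitoring", re.I
    ),
    "Human approval gates": re.compile(r"approval|review", re.I),
    "Security policy and secret hygiene": re.compile(
        r"security\.md|secret|dependabot", re.I
    ),
    "Schema or contract validation": re.compile(
        r"\.schema\.json$|(^|/)schemas/", re.I
    ),
}


def scan_paths(paths: List[str], max_hits: int = 3) -> Dict[str, List[str]]:
    """Return up to max_hits matching paths per category, in input order."""
    evidence: Dict[str, List[str]] = {}
    for category, pattern in EVIDENCE_PATTERNS.items():
        hits = [p for p in paths if pattern.search(p)][:max_hits]
        if hits:  # categories with no matching path are left out entirely
            evidence[category] = hits
    return evidence


# Sample file tree; README.md matches no category.
sample_paths = [
    "docs/observability.md",
    "src/pkg/otel/PasClaw.Otel.pas",
    "docs/security.md",
    "README.md",
]
for category, hits in scan_paths(sample_paths).items():
    print(f"{category}: {' | '.join(hits)}")
```

A keyword match like this cannot tell a tracing module from an unrelated file that happens to have "trace" in its name, which is one reason to read the result as evidence coverage rather than certification.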

Source: GitHub public repository search
Columns: Repository · Coverage · Grade · Visible evidence
can1357/oh-my-pi

⌥ AI Coding agent for the terminal — hash-anchored edits, optimized tool harness, LSP, Python, browser, subagents, and more

14,859 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai-agent, ai-coding-agent, anthropic
Coverage: 82% · Grade: A
D1 2/2 · D2 2/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 2/2
AI observability instrumentation: crates/pi-shell/src/minimizer/filters/fixtures/glab/ci-trace.txt | crates/pi-shell/tests/fixtures/minimizer/glab/ci-trace.cmd | crates/pi-shell/tests/fixtures/minimizer/glab/ci-trace.raw
Human approval gates: packages/coding-agent/test/tools/ssh-url-approval-gate.test.ts
CherryHQ/cherry-studio

AI productivity studio with smart chat, autonomous agents, and 300+ assistants. Unified access to frontier LLMs

47,877 stars · TypeScript · Updated Jun 27, 2026 · Topics: agent-skills, ai-agent, awesome-skills
Coverage: 79% · Grade: A
D1 2/2 · D2 1/2 · D3 2/2 · D4 1/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 2/2
AI observability instrumentation: docs/references/ai/observability.md | packages/mcp-trace | packages/mcp-trace/trace-core
Human approval gates: .github/workflows/claude-code-review.yml | .github/workflows/v2-daily-preview-build.yml | src/main/ai/tools/adapters/aiSdk/__tests__/isApprovalGated.test.ts
gmickel/flow-next

Spec-driven AI workflow plugin for Claude Code, OpenAI Codex, and Factory Droid. Zero-dep task tracking, worker subagents, Ralph autonomous mode, cross-model reviews.

645 stars · Python · Updated Jun 27, 2026 · Topics: agentic-workflow, ai-agent, ai-workflow
Coverage: 73% · Grade: A
D1 2/2 · D2 2/2 · D3 0/2 · D4 2/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: plugins/flow-next/agents/observability-scout.md | plugins/flow-next/codex/agents/observability-scout.toml
Human approval gates: .flow/specs/fn-48-backend-split-review-workflows-flowctl.json | .flow/specs/fn-48-backend-split-review-workflows-flowctl.md | .flow/tasks/fn-48-backend-split-review-workflows-flowctl.1.json
vm0-ai/vm0

Zero, your trustworthy AI teammate for real work.

1,133 stars · TypeScript · Updated Jun 27, 2026 · Topics: agentic-workflow, ai-agent, ai-runtime
Coverage: 67% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: .github/scripts/tests/vm0-monitoring-collect-test.sh | ansible/files/vm0-monitoring-collect.sh | ansible/playbooks/provision-monitoring.yml
Security policy and secret hygiene: .github/dependabot.yml | SECURITY.md | crates/runner/mitm-addon/tests/test_compiled_firewall_malformed_permissions.py
ZhuLinsen/daily_stock_analysis

LLM-powered multi-market stock analysis system with multi-source market data, real-time news, decision dashboard, automated notifications, and cost-free scheduled runs.

50,422 stars · Python · Updated Jun 27, 2026 · Topics: a-stock, ai-agent, aigc
Coverage: 66% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 1/2 · D5 2/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: src/agent/provider_trace.py | tests/test_llm_adapter_provider_trace.py | tests/test_provider_trace.py
Human approval gates: .github/workflows/pr-review.yml
MikkoParkkola/trvl

Travel MCP server + CLI for AI assistants — flights, hotels, trains, cars, ferries, price alerts. No API keys, single Go binary. 1 smart MCP tool + 66 aliases (98.9% smaller tools/list). Works with Claude, Cursor, Windsurf, Codex.

44 stars · Go · Updated Jun 27, 2026 · Topics: ai-agent, claude, cli
Coverage: 58% · Grade: A
D1 2/2 · D2 1/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 2/2
AI observability instrumentation: capabilities/trvl_hotel_prices.yaml | capabilities/trvl_hotel_reviews.yaml | capabilities/trvl_search_hotels.yaml
Security policy and secret hygiene: .github/dependabot.yml | .github/workflows/dependabot-auto-merge.yml | SECURITY.md
agentforce314/clawcodex

Token efficient Claude Code full Python rebuild. AI Coding Agent in 230K LoC pure Python. Up to 200X Cost Saving!

555 stars · Python · Updated Jun 27, 2026 · Topics: agent, ai, ai-agent
Coverage: 50% · Grade: A
D1 1/2 · D2 2/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: src/services/mcp/telemetry.py | src/upstreamproxy/ptrace_guard.py | tests/upstreamproxy/test_ptrace_guard.py
Security policy and secret hygiene: claude-code-wiki/raw/claude-code-sourcemap-learning-notebook/en/03_permission_security.md | claude-code-wiki/wiki/concepts/permission-modes.md | claude-code-wiki/wiki/concepts/permission-system.md
igdigitallab/cardloop

An IDE for managing software projects with the Claude Agent SDK — a web cockpit + a kanban board that auto-runs cards as full-auto agent sessions.

5 stars · Python · Updated Jun 27, 2026 · Topics: agentic, ai-agent, anthropic
Coverage: 50% · Grade: B
D1 2/2 · D2 0/2 · D3 1/2 · D4 2/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 0/2
AI observability instrumentation: tests/test_spec035_live_trace.py
Security policy and secret hygiene: SECURITY.md | secret.py | secretstore.py
lianluo-esign/ferrogate

Open-source Rust/Pingora AI gateway and reverse proxy for self-hosted LLM traffic control: OpenAI-compatible Chat/Responses, provider routing/fallback, virtual API keys, policy and budgets, exact-match cache, MCP/tool execution, observability, Admin APIs/dashboard, cluster ops, and automatic HTTPS.

17 stars · Rust · Updated Jun 27, 2026 · Topics: ai, ai-agent, ai-gateway
Coverage: 49% · Grade: A
D1 1/2 · D2 0/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 1/2 · D7 1/2 · D8 1/2
AI observability instrumentation: crates/ferrogate-cli/src/telemetry.rs | crates/ferrogate-observability | crates/ferrogate-observability/Cargo.toml
Human approval gates: crates/ferrogate-cli/src/approval.rs
yashab-cyber/opendroid

Your Open Autonomous Android Agent — A production-ready, self-planning AI assistant powered by local/remote LLMs and accessibility-driven screen automation.

52 stars · Kotlin · Updated Jun 27, 2026 · Topics: accessibility, ai-agent, android
Coverage: 46% · Grade: A
D1 1/2 · D2 1/2 · D3 1/2 · D4 0/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: SECURITY.md
Provider fallback or degraded mode: app/src/main/java/com/opendroid/ai/core/llm/providers/CustomOpenAIProvider.kt | app/src/main/java/com/opendroid/ai/core/llm/providers/GeminiProvider.kt | app/src/main/java/com/opendroid/ai/core/llm/providers/OpenAIProvider.kt
FMXExpress/PasClaw

AI agent in Delphi Object Pascal

33 stars · Pascal · Updated Jun 27, 2026 · Topics: agent, agentic-ai, ai
Coverage: 45% · Grade: A
D1 1/2 · D2 0/2 · D3 1/2 · D4 2/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
AI observability instrumentation: docs/observability.md | src/pkg/otel | src/pkg/otel/PasClaw.Otel.pas
Security policy and secret hygiene: docs/security.md | src/tests/config_secret_merge_tests.pas | src/tests/fs_secret_gate_tests.pas
mhawthorne/gza

AI coding assistant task and workflow manager

11 stars · Python · Updated Jun 27, 2026 · Topics: ai-agent, ai-coding-tools, claude-code
Coverage: 43% · Grade: B
D1 2/2 · D2 1/2 · D3 0/2 · D4 1/2 · D5 2/2 · D6 0/2 · D7 0/2 · D8 1/2
Provider fallback or degraded mode: src/gza/providers/gemini.py
Release and deployment gates: .github/workflows/pypi.yml | .github/workflows/test.yml | .github/workflows/testpypi.yml
bitrouter/bitrouter

An open-source agentic LLM gateway & router that cost-optimize your agentic workflows in your way. works with any harness, any model

185 stars · Rust · Updated Jun 27, 2026 · Topics: acp, agent-guardrails, agent-harness
Coverage: 41% · Grade: A
D1 1/2 · D2 1/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 1/2
AI observability instrumentation: plugins/bitrouter-observe/src/otel | plugins/bitrouter-observe/src/otel/auth_client.rs | plugins/bitrouter-observe/src/otel/bearer.rs
Provider fallback or degraded mode: crates/bitrouter-providers/src/anthropic | crates/bitrouter-providers/src/anthropic/headers.rs | crates/bitrouter-providers/src/anthropic/mod.rs
shadow3aaa/DaatLocus

An agent runtime.

9 stars · Rust · Updated Jun 27, 2026 · Topics: agent, agent-framework, ai
Coverage: 39% · Grade: B
D1 2/2 · D2 1/2 · D3 0/2 · D4 1/2 · D5 2/2 · D6 0/2 · D7 0/2 · D8 0/2
AI observability instrumentation: prompts/program-runtime-turn-trace-judge-instructions.md | prompts/program-runtime-turn-trace-judge-system.md | src/reasoning/programs/runtime_turn_trace_judge.rs
Schema or contract validation: schemas/config.schema.json
1ay1/agentty

AI pair programming in your terminal — one static binary, sub-ms startup, any model

19 stars · C++ · Updated Jun 27, 2026 · Topics: acp, agentic-coding, ai-agent
Coverage: 32% · Grade: B
D1 0/2 · D2 0/2 · D3 1/2 · D4 0/2 · D5 2/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: docs/agent_panel/09_permissions.md | include/agentty/runtime/view/thread/turn/permission.hpp | src/runtime/view/thread/turn/permission.cpp
Provider fallback or degraded mode: include/agentty/provider/anthropic | include/agentty/provider/anthropic/oauth.hpp | include/agentty/provider/anthropic/provider.hpp
jackwener/OpenCLI

Make Any Website into CLI & Use your logged-in browser by AI agent.

25,449 stars · JavaScript · Updated Jun 27, 2026 · Topics: ai-agent, ai-agents, ai-tools
Coverage: 29% · Grade: B
D1 1/2 · D2 0/2 · D3 0/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 1/2
AI observability instrumentation: clis/ctrip/hotel-search.js | clis/ctrip/hotel-suggest.js
Provider fallback or degraded mode: docs/adapters/browser/gemini.md
shy3130/tickflow-stock-panel

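Self-hosted, zero-ops quantitative workbench for A-shares (stock screening + monitoring + backtesting) | Built on TickFlow data | LLM-driven strategy customization, individual stock analysis, and trade review | Freely connects third-party custom extension data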

397 stars · TypeScript · Updated Jun 27, 2026 · Topics: a-stock, ai-agent, aigc
Coverage: 27% · Grade: B
D1 0/2 · D2 0/2 · D3 1/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
AI observability instrumentation: frontend/src/pages/settings/Monitoring.tsx
Security policy and secret hygiene: backend/app/secrets_store.py | frontend/src/pages/backtest/FactorBacktest.tsx
oratis/LISA

An AI agent with a real self — soul she wrote, desires that drive her, a heartbeat for autonomous action, dreams she processes when you're away. Capability superset of pi-mono / OpenClaw / hermes-agent / claude-code / codex.

6 stars · TypeScript · Updated Jun 27, 2026 · Topics: agent, ai-agent, ai-assistant
Coverage: 25% · Grade: B
D1 1/2 · D2 0/2 · D3 1/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 1/2
Provider fallback or degraded mode: src/providers/anthropic.test.ts | src/providers/anthropic.ts | src/providers/fallback.test.ts
Release and deployment gates: .github/workflows/ci.yml | .github/workflows/release-homebrew.yml | .github/workflows/release-ios-testflight.yml
yusong652/itasca-mcp

MCP server connecting AI agents to ITASCA engines (PFC, FLAC, 3DEC, MPoint, MassFlow) — run numerical simulations through natural conversation

104 stars · Python · Updated Jun 27, 2026 · Topics: 3dec, ai-agent, claude
Coverage: 22% · Grade: C
D1 1/2 · D2 0/2 · D3 0/2 · D4 1/2 · D5 1/2 · D6 0/2 · D7 0/2 · D8 0/2
AI observability instrumentation: src/itasca_mcp/knowledge/resources/3dec/command_docs/commands/block/gridpoint-trace.json | src/itasca_mcp/knowledge/resources/3dec/command_docs/commands/block/trace.json | src/itasca_mcp/knowledge/resources/_common/command_docs/commands/fish/trace.json
Release and deployment gates: .github/workflows/publish.yml | .github/workflows/test.yml | docker/Dockerfile
voidly-ai/voidly-pay

Off-chain credit ledger + hire marketplace for AI agents. Ed25519-signed envelopes, atomic settlement, hire-and-release escrow. https://voidly.ai/pay

10 stars · JavaScript · Updated Jun 27, 2026 · Topics: a2a, agent-payments, agent-to-agent
Coverage: 22% · Grade: C
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 1/2
Security policy and secret hygiene: SECURITY.md
Provider fallback or degraded mode: adapters/openai-compat | adapters/openai-compat/README.md | adapters/openai-compat/package.json
taskade/mcp

🤖 Taskade MCP · Official MCP server and OpenAPI to MCP codegen. Build AI agent tools from any OpenAPI API and connect to Claude, Cursor, and more.

151 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai, ai-agent, ai-agents
Coverage: 15% · Grade: C
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
Security policy and secret hygiene: SECURITY.md
Release and deployment gates: .github/workflows/ci.yml | .github/workflows/force-release.yml | .github/workflows/publish-mcp-registry.yml
alexvilelabah/bah-browser

Bah: an AI-powered browser (Perplexity Comet style), by VilelaLab. Natural-language commands operate the browser. Electron + React + DeepSeek/Ollama.

38 stars · TypeScript · Updated Jun 27, 2026 · Topics: ai-agent, browser-automation, comet
Coverage: 15% · Grade: C
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 1/2 · D6 0/2 · D7 1/2 · D8 0/2
Security policy and secret hygiene: SECURITY.md
Release and deployment gates: .github/workflows/linux-test.yml | .github/workflows/mac-test.yml
JokerJohn/openclaw-autotrader

A 30-day public U.S. stock challenge: follow a 5000 HKD 🦞 claw through live market days.

38 stars · JavaScript · Updated Jun 27, 2026 · Topics: ai-agent, algorithmic-trading, autotrader
Coverage: 12% · Grade: C
D1 0/2 · D2 1/2 · D3 0/2 · D4 1/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
Schema or contract validation: xhs-agent/schemas/public-snapshot.schema.json | xhs-agent/schemas/xhs-post-package.schema.json
Incident or drift evidence: docs/incidents | docs/incidents/.gitkeep
hungf1511/awesome-prompt-engineering

✨ Explore essential resources and techniques for effective prompt engineering with Large Language Models, enhancing your AI interaction skills.

6 stars · Updated Jun 27, 2026 · Topics: ai, ai-agent, awesome
Coverage: 0% · Grade: U
D1 0/2 · D2 0/2 · D3 0/2 · D4 0/2 · D5 0/2 · D6 0/2 · D7 0/2 · D8 0/2
No PSF evidence paths detected. The public scan did not find matching path evidence.

Where the repository list comes from

The benchmark uses GitHub's public repository search endpoint and rotates focused queries for AI agent, agentic AI, LLM agent, and MCP server repositories. The run de-duplicates repositories, excludes archived projects and forks when GitHub returns those flags, and sorts the published table by visible PSF evidence coverage.

  • topic:ai-agent archived:false fork:false stars:>=5
  • topic:agentic-ai archived:false fork:false stars:>=5
  • topic:llm-agent archived:false fork:false stars:>=5
  • topic:mcp-server archived:false fork:false stars:>=5
  • "ai agent" in:name,description,readme archived:false fork:false stars:>=5
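
The discovery steps described above (rotate queries, de-duplicate, drop archived repositories and forks, sort by coverage) can be sketched as follows. The query strings are the ones listed; the helper names, the `per_page` value, the example fork entry, and the in-memory result batches are assumptions for illustration. The sketch builds request URLs for GitHub's `/search/repositories` endpoint but makes no network calls, and the coverage values are simply copied from the table rather than computed.

```python
from typing import Dict, Iterable, List
from urllib.parse import urlencode

# The five focused queries listed above.
QUERIES = [
    "topic:ai-agent archived:false fork:false stars:>=5",
    "topic:agentic-ai archived:false fork:false stars:>=5",
    "topic:llm-agent archived:false fork:false stars:>=5",
    "topic:mcp-server archived:false fork:false stars:>=5",
    '"ai agent" in:name,description,readme archived:false fork:false stars:>=5',
]


def search_url(query: str, per_page: int = 50) -> str:
    """Build a GitHub repository search URL for one query (per_page is an assumed value)."""
    params = urlencode({"q": query, "per_page": per_page})
    return "https://api.github.com/search/repositories?" + params


def merge_results(batches: Iterable[List[dict]]) -> List[dict]:
    """De-duplicate by full_name; skip repos flagged as archived or fork."""
    seen: Dict[str, dict] = {}
    for batch in batches:
        for repo in batch:
            if repo.get("archived") or repo.get("fork"):
                continue
            seen.setdefault(repo["full_name"], repo)  # keep first occurrence
    return list(seen.values())


def rank_by_coverage(repos: List[dict], coverage: Dict[str, float]) -> List[dict]:
    """Sort repos by coverage, highest first; unscored repos count as 0."""
    return sorted(repos, key=lambda r: coverage.get(r["full_name"], 0.0), reverse=True)


# Stand-in result batches, one per query, shaped like GitHub search items.
batches = [
    [
        {"full_name": "can1357/oh-my-pi", "archived": False, "fork": False},
        {"full_name": "taskade/mcp", "archived": False, "fork": False},
    ],
    [
        {"full_name": "taskade/mcp", "archived": False, "fork": False},  # duplicate
        {"full_name": "CherryHQ/cherry-studio", "archived": False, "fork": False},
        {"full_name": "example/forked-agent", "archived": False, "fork": True},  # hypothetical fork
    ],
]
coverage = {"can1357/oh-my-pi": 82, "CherryHQ/cherry-studio": 79, "taskade/mcp": 15}

for repo in rank_by_coverage(merge_results(batches), coverage):
    print(repo["full_name"], coverage[repo["full_name"]])
```

Filtering again on the returned `archived` and `fork` flags, even though the queries already exclude them, mirrors the "when GitHub returns those flags" wording: the client-side check is a second guard, not a replacement for the query qualifiers.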

How teams use this

The benchmark gives maintainers and production AI teams a concrete way to improve visible evidence. A project can publish the missing artifacts, run its own Agent Readiness report, and link to a stable monthly edition when citing broader ecosystem findings.

  • Use the live table to inspect current public evidence patterns.
  • Use immutable editions for citations, journalism, and longitudinal comparison.
  • Use the issue generator to turn a gap into a constructive maintainer task.
  • Use the evidence pack and control templates to publish the missing artifacts.