Production AI Institute · PSF v1.1 open standard
PAI research programme

Public research for production AI deployment

Position papers, incident analysis, framework assessments, and monthly intelligence that make production AI safety inspectable. The work is public because a standard only matters if people can inspect it.

Read the latest Briefing →
Browse all Insights
Open methodology · Incident patterns · Framework analysis · Practitioner evidence
Intelligence Briefing

Monthly · Free · Practitioner-written

All issues →
Issue 001 · May 2026

Karpathy's Third Chip Flip · Software 3.0 · The Agent Operator · PAI Patterns Library

Five developments shaping production AI this month — with the PAI angle on what they mean for practitioners building real systems today.

Read Issue 001 →

From research to action

Use the evidence to choose a proof path.

Research creates leverage only when the next step is obvious. Pick the route that matches the pressure you are under.

All research areas

LAB · Q2 2026

PAI Lab

Structured reliability testing of frontier AI models and agent frameworks against PSF criteria. Quarterly scorecards. Open methodology.

Explore →

R&I · 50+ articles

Insights

Analysis, essays, and deep-dives on AI deployment, safety, and the production practitioner experience.

Explore →

PTR · 21 patterns

Patterns Library

Reusable workflow patterns for production AI systems — vetted against the PSF and ready to adapt.

Explore →

INC · Case studies

Incidents

Documented AI failure cases with root-cause analysis mapped to PSF domains. Learn from what went wrong.

Explore →

ECO · 12 frameworks

Ecosystem Reports

Independent PSF assessments of every major AI framework — LangChain, CrewAI, AutoGen, Cursor, and more.

Explore →

PSF · PAI-8 / PSF

Standard

The Production Safety Framework itself — eight domains, openly published and freely referenceable.

Explore →

ARI · Public index

Agent Readiness Index

Generate a PSF-aligned readiness report for an AI agent, with evidence grade, repository signals, and a shareable badge.

Explore →

Recent findings (2026)

May–June 2026 lab and ecosystem work — indexed here for procurement visitors; formal publication titles below are unchanged.

Ecosystem assessment · June 2026

Cursor Enterprise Organizations — PSF assessment

Multi-team governance, org-level IdP, and usage analytics for Cursor Enterprise Organizations GA (3 Jun 2026).

Read →
Ecosystem assessment · June 2026

OpenAI Codex Sites & role plugins — PSF assessment

Enterprise plugins, hosted Sites, and human-refinement signals for Codex knowledge work (2 Jun 2026).

Read →
Ecosystem assessment · June 2026

OpenAI on Amazon Bedrock — PSF assessment

AWS-native governance and regional inference for OpenAI models and Codex on Bedrock GA (1 Jun 2026).

Read →
Open source · June 2026

Why we open-sourced WorkflowOS

PSF Workflow Studio released under MIT — the working artifact behind the open Production Safety Framework text.

Read →
Lab report · June 2026

PAI Lab: public GitHub agent readiness (May 2026 cohort)

Empirical scan of 20 public agent repos against PSF evidence signals (PAI-ARI-2026.1).

Read →
Incident analysis · June 2026

OpenAI May 2026 multi-service outage

Vendor-reported ChatGPT, login, and checkout failures on 29 May 2026 — mapped to D8, D4, and D5; indexed in the incident registry.

Read →
Incident analysis · May 2026

Binnall Law — Claude Console phantom citations in federal court

Verified May 2026 filing failure mapped to D2, D5, and D6 — indexed in the incident registry.

Read →
Data use index · May 2026

AI Data Use Index — week 5 (May 2026)

Weekly practitioner index of vendor and product data-use posture changes.

Read →
Ecosystem assessment · May 2026

OpenAI Codex CLI 0.134 — PSF assessment

Independent PSF coverage review of Codex CLI for production agent workflows.

Read →
Ecosystem assessment · May 2026

Cursor Automations 3.5 — PSF assessment

Vendor resilience and deployment-safety signals for Cursor Automations 3.5.

Read →
Ecosystem assessment · May 2026

Google Agent Executor — PSF assessment

Framework assessment of Google Agent Executor against PSF domains.

Read →

PAI Publications

Position papers, analyses, and framework notes from the PAI research programme.

Position paper · 2025

The EU AI Act and the Production Safety Framework: A Practitioner's Guide

Maps PSF domains to EU AI Act obligations for high-risk AI system deployers. Covers conformity assessment requirements, technical documentation standards, and human oversight obligations.

EU AI Act · Compliance · Regulation
Read the paper →
Analysis · 2025

Incident Patterns in Production LLM Deployments

Analysis of common failure modes in production LLM deployments. Identifies root causes across PSF domains and intervention patterns.

Incidents · LLM · Root cause
Read the paper →
Position paper · 2024

Human Oversight in High-Stakes AI: What 'Meaningful' Means in Practice

Examines what constitutes meaningful human oversight in high-stakes AI-assisted decisions. Includes design patterns for effective human checkpoints.

Human oversight · PSF Domain 05 · Design patterns
Read the paper →
Framework note · 2024

PSF v1.0 Rationale and Development History

Documents the reasoning behind each PSF domain, alternatives considered, and how practitioner feedback shaped the framework.

PSF · Framework development
Read the paper →

Redacted assurance findings (cross-client patterns)

Patterns repeatedly observed in anonymised production assurance reviews and incident-led postmortems:

D2 output contracts missing

Model text consumed by downstream systems as if it were trusted structured data.

Review PSF-D2
D4 observability blind spots

Strong model-call logs, weak cross-service traceability at queue and handoff boundaries.

See Lab methodology
D5/D6 autonomy without gates

Operational actions executed without explicit human intervention criteria on high-consequence paths.

Use deployment guidance
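
To make the D2 and D5/D6 patterns concrete, here is a minimal Python sketch of one way to close both gaps: treat model text as untrusted input, check it against an explicit output contract, and apply a stated human-approval criterion before any high-consequence action. The field names, the action list, and the helper names (`parse_decision`, `needs_human_approval`) are assumptions made for illustration; they are not taken from the PSF text or from any reviewed system.

```python
import json
from dataclasses import dataclass

# Hypothetical contract values, chosen only for illustration.
ALLOWED_ACTIONS = {"refund", "escalate", "close_ticket"}
HIGH_CONSEQUENCE = {"refund"}


class ContractError(ValueError):
    """Raised when model output does not satisfy the output contract."""


@dataclass(frozen=True)
class Decision:
    action: str
    amount: float


def parse_decision(model_text: str) -> Decision:
    """Treat model text as untrusted: parse it, then check every field."""
    try:
        data = json.loads(model_text)
    except json.JSONDecodeError as exc:
        raise ContractError(f"not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ContractError("expected a JSON object")

    action = data.get("action")
    amount = data.get("amount")
    if not isinstance(action, str) or action not in ALLOWED_ACTIONS:
        raise ContractError(f"unknown action: {action!r}")
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)):
        raise ContractError(f"invalid amount: {amount!r}")
    if amount < 0:
        raise ContractError(f"negative amount: {amount!r}")
    return Decision(action=action, amount=float(amount))


def needs_human_approval(decision: Decision) -> bool:
    """Explicit intervention criterion for high-consequence paths."""
    return decision.action in HIGH_CONSEQUENCE
```

A caller would run `parse_decision` on every model response, reject anything that raises `ContractError`, and route decisions for which `needs_human_approval` returns true to a reviewer instead of executing them directly. A production system would likely use a schema library and a richer policy than a fixed set of action names.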
Read the PSF →
Ecosystem assessments →

Contribute to the research programme

PAI collects anonymised incident reports from practitioners to inform framework development. If you have experienced a production AI failure and are willing to share details, we welcome your contribution.

Submit an incident report
Read the PSF
The Production AI Brief

Get the brief that keeps AI work defensible

PSF updates, deployment checks, failure patterns, and proof paths for practitioners, MSPs, and teams who need AI work to survive scrutiny. No hype.