New from the Lab·The Compass — an open moral reasoning standard for AI, tested across frontier modelsExplore →
Production AI Institute · PSF v1.1 open standard
AI Right-To-KnowAI Data Use IndexCheck My AI ToolsPolicy Change WatchAgent ReadinessPublic BenchmarkContactGlobal standard · Worldwide
HomeResearchLLM Incident Patterns
AnalysisJune 2025CC BY 4.0

Incident Patterns in Production LLM Deployments

A synthesis of common failure modes observed across production LLM deployments, drawing on patterns identified through PAI assessor reviews, CPAP portfolio submissions, and publicly reported post-mortems. This paper maps failure modes to PSF domains and describes the intervention patterns that resolve them.

Key findings

>2/3
failures trace to PSF-1 or PSF-2 gaps
~4×
faster resolution with logging + escalation
#1
prompt injection — fastest-growing vector
Most
had no pre-deployment output contract

The most consistent pattern is the concentration of failures in PSF domains 1 and 2. Input governance failures — unvalidated user inputs reaching the model — and output validation failures — model outputs consumed downstream without schema validation — together account for the majority of preventable failures observed across assessments. Both are addressable with straightforward engineering controls.

Incident distribution by PSF domain

PSF-1 Input Governance~34% of failures

Prompt injection attacks, unvalidated inputs causing off-topic or harmful outputs

PSF-2 Output Validation~34% of failures

Malformed JSON consumed downstream, hallucinated values in structured fields, schema drift after model update

PSF-5 Deployment Safety~13% of failures

Silent model version change causing output format regression, lack of rollback on failed deployment

PSF-6 Human Oversight~11% of failures

Autonomous action taken beyond intended scope, escalation path not reached on time-sensitive decisions

PSF-4 Observability~4% of failures

Failures not detected for >48 hours due to absent or incomplete logging

PSF-8 Vendor Resilience~4% of failures

Provider API outage with no fallback, unexpected model deprecation without migration path

The prompt injection problem

Prompt injection — the technique by which adversarial content in user inputs or retrieved documents causes the model to deviate from its intended behaviour — is consistently among the most common failure categories observed, and represents the fastest-growing vector in public reporting.

In the majority of injection cases, the attack vector is retrieved content: documents, emails, or web pages that the system is asked to summarise or process contain embedded instructions. A smaller proportion involve direct user manipulation of the prompt structure.

Pattern:Systems that process third-party content — documents, emails, web pages — require explicit injection defence that treats retrieved content as untrusted data, separate from the system prompt. Systems that do not implement this separation are consistently exploitable.

What resolves incidents fastest

Time-to-resolution correlates most strongly with two factors: whether the system had observable logging (PSF-4) and whether a human escalation path existed (PSF-6). Systems with both in place resolve failures significantly faster — the gap between prepared and unprepared systems is consistently greater than 4× in reported cases.

The intervention pattern most associated with fast resolution is the existence of a named incident owner — a person whose responsibility includes monitoring the system — combined with alerting thresholds configured before deployment. The majority of reported failures lack both conditions at the time of the incident.

Methodology and limitations

This analysis draws on three sources: patterns observed by PAI assessors during CPAP and CPAA portfolio reviews; publicly reported post-mortems and disclosed incidents from AI practitioners and organisations; and practitioner-submitted case notes contributed voluntarily to PAI. All case material is treated as confidential and no identifying information is included.

The proportional breakdowns in this paper reflect relative frequency across the observed corpus, not a statistically random sample. They should be interpreted as directional indicators of where practitioners are most likely to encounter failures — not as precise population estimates. Cases of severe or ongoing failures are likely under-represented, as practitioners are less able to share them.

Published by the Production AI Institute, June 2025. Licensed CC BY 4.0.

Related: Production Safety Framework · Seven failure modes (Insights) · Submit an incident report