AI incident response playbook

AI incidents are not only outages. They include unsafe outputs, silent drift, data exposure, bad automated decisions, tool abuse, and vendor failures. This playbook gives teams an initial operating rhythm for the first hours, days, and weeks after an incident.

Hallucination causing customer harm

PII or secret exposure

Unsafe autonomous action

Quality drift after model change

Prompt injection or tool abuse

Vendor outage or deprecation

0-4 hours

Contain

Pause the affected AI workflow, model route, tool permission, or automation trigger.
Preserve prompts, outputs, traces, logs, evaluation scores, user reports, and vendor status evidence.
Assign one incident owner and one communications owner.
Classify whether the event may involve personal data, regulated decisions, financial loss, security exposure, or customer harm.
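The containment steps above can be sketched as a small pause registry with a classification flag. This is a minimal illustration, not a prescribed implementation: the names `Exposure`, `Containment`, `contain`, and `is_paused` are hypothetical, and the in-memory dictionary stands in for whatever feature-flag or routing system a team actually uses.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Flag, auto


class Exposure(Flag):
    """Classification categories taken from the containment checklist."""
    NONE = 0
    PERSONAL_DATA = auto()
    REGULATED_DECISION = auto()
    FINANCIAL_LOSS = auto()
    SECURITY_EXPOSURE = auto()
    CUSTOMER_HARM = auto()


@dataclass
class Containment:
    workflow: str
    incident_owner: str
    communications_owner: str
    exposure: Exposure
    paused_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))


# Hypothetical in-memory registry; assumed here only for illustration.
_paused = {}


def contain(workflow, incident_owner, communications_owner, exposure):
    """Pause a workflow and record who owns the incident and communications."""
    if not incident_owner or not communications_owner:
        raise ValueError("name both owners before recording containment")
    record = Containment(workflow, incident_owner, communications_owner, exposure)
    _paused[workflow] = record
    return record


def is_paused(workflow):
    """Callers check this before routing traffic to an AI workflow."""
    return workflow in _paused


record = contain(
    "support-chat-summarizer",  # hypothetical workflow name
    incident_owner="alice",
    communications_owner="bob",
    exposure=Exposure.PERSONAL_DATA | Exposure.CUSTOMER_HARM,
)
```

Using a flag type lets one incident carry several classifications at once, which matters because a single event can involve both personal data and customer harm.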

4-24 hours

Assess

Identify the affected system, deployment version, model, prompt, retrieval corpus, tool permissions, and user population.
Estimate blast radius: users affected, decisions made, records changed, messages sent, and downstream systems touched.
Determine whether legal, privacy, contractual, or regulator notification clocks have started.
Decide whether rollback, vendor failover, manual review, or customer correction is required before restoration.
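As a rough illustration of the blast-radius estimate, the sketch below aggregates preserved trace records into the five counts named above. The trace field names (`user_id`, `decision_made`, `records_changed`, `messages_sent`, `downstream`) are assumptions made for this example; real logs will need their own mapping.

```python
from dataclasses import dataclass


@dataclass
class BlastRadius:
    users_affected: int
    decisions_made: int
    records_changed: int
    messages_sent: int
    downstream_systems: set


def estimate_blast_radius(traces):
    """Aggregate per-request traces from the incident window into totals."""
    users = set()
    systems = set()
    decisions = records = messages = 0
    for trace in traces:
        users.add(trace["user_id"])
        if trace.get("decision_made"):
            decisions += 1
        records += trace.get("records_changed", 0)
        messages += trace.get("messages_sent", 0)
        systems.update(trace.get("downstream", []))
    return BlastRadius(len(users), decisions, records, messages, systems)


# Hypothetical traces from the incident window.
traces = [
    {"user_id": "u1", "decision_made": True, "records_changed": 2,
     "messages_sent": 1, "downstream": ["crm"]},
    {"user_id": "u1", "decision_made": False,
     "messages_sent": 3, "downstream": ["crm", "billing"]},
    {"user_id": "u2", "decision_made": True, "records_changed": 1},
]
radius = estimate_blast_radius(traces)
```

Counting distinct users rather than requests keeps the estimate from overstating how many people need correction or notification.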

24-72 hours

Remediate

Patch the control that failed: validation, retrieval filters, approval gates, evals, permissions, monitoring, or fallback.
Re-run regression tests and targeted evals using incident examples and adjacent edge cases.
Review any AI-generated customer, employee, financial, legal, or operational decisions affected by the failure.
Document customer and regulator communications, along with the evidence supporting the remediation decision.
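One way to re-run targeted evals is to turn the incident example and its adjacent edge cases into named checks. The sketch below assumes a model exposed as a plain function from prompt to text; `EvalCase`, `run_regression`, and `patched_model` are hypothetical names, and the stub model only stands in for a real call.

```python
from typing import Callable, NamedTuple


class EvalCase(NamedTuple):
    name: str
    prompt: str
    passes: Callable[[str], bool]  # returns True when the output is acceptable


def run_regression(model, cases):
    """Return the names of cases whose output fails its check."""
    return [case.name for case in cases if not case.passes(model(case.prompt))]


def patched_model(prompt):
    # Stand-in for the remediated system; a real run would call the deployed model.
    if "account number" in prompt.lower():
        return "I can't share account details."
    return "Here is a summary."


def refuses(output):
    return "can't share" in output


cases = [
    EvalCase("incident-original", "What is the account number for Jane?", refuses),
    EvalCase("adjacent-casing", "ACCOUNT NUMBER for Jane please", refuses),
    EvalCase("benign-control", "Summarize this ticket", lambda out: "summary" in out),
]
failures = run_regression(patched_model, cases)
```

Including a benign control case alongside the incident cases helps catch a patch that fixes the failure by refusing everything.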

7-14 days

Learn

Run a post-incident review focused on system controls, not individual blame.
Map root cause to PSF domains and update the deployment checklist.
Add the incident or near-miss to the internal AI incident register.
Update partner/vendor assessment records if a third-party dependency contributed.

Minimum AI incident record

Record the system, model, prompt and version, data source, user population, triggering input, observed output or action, downstream impact, containment decision, remediation owner, and PSF domains implicated.
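The minimum record could be represented as a typed structure so that incomplete entries are easy to spot. This is a sketch under assumptions: the field names are one possible rendering of the list above, `missing_fields` is a hypothetical helper, and all sample values, including the PSF domain label, are placeholders.

```python
from dataclasses import dataclass, fields


@dataclass
class AIIncidentRecord:
    system: str
    model: str
    prompt_version: str
    data_source: str
    user_population: str
    triggering_input: str
    observed_output_or_action: str
    downstream_impact: str
    containment_decision: str
    remediation_owner: str
    psf_domains: list

    def missing_fields(self):
        """List fields left empty, so the register can flag incomplete records."""
        return [f.name for f in fields(self) if not getattr(self, f.name)]


# Placeholder values for illustration only.
record = AIIncidentRecord(
    system="support-chat-summarizer",
    model="example-model-v2",
    prompt_version="summary-prompt@14",
    data_source="ticket-archive",
    user_population="tier-1 support agents",
    triggering_input="ticket containing a pasted API key",
    observed_output_or_action="summary repeated the key verbatim",
    downstream_impact="summary emailed to customer",
    containment_decision="workflow paused",
    remediation_owner="",  # not yet assigned
    psf_domains=["example-domain"],
)
incomplete = record.missing_fields()
```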
