The professional standard for production AI deployment
Verify a credentialFor organisationsPartner NetworkFor nonprofits & NGOsContact
Incident Response

AI incident response playbook

AI incidents are not only outages. They include unsafe outputs, silent drift, data exposure, bad automated decisions, tool abuse, and vendor failures. This playbook gives teams the first operating rhythm.

Hallucination causing customer harm
PII or secret exposure
Unsafe autonomous action
Quality drift after model change
Prompt injection or tool abuse
Vendor outage or deprecation
0-4 hours

Contain

  1. Pause the affected AI workflow, model route, tool permission, or automation trigger.
  2. Preserve prompts, outputs, traces, logs, evaluation scores, user reports, and vendor status evidence.
  3. Assign one incident owner and one communications owner.
  4. Classify whether the event may involve personal data, regulated decisions, financial loss, security exposure, or customer harm.
4-24 hours

Assess

  1. Identify the affected system, deployment version, model, prompt, retrieval corpus, tool permissions, and user population.
  2. Estimate blast radius: users affected, decisions made, records changed, messages sent, and downstream systems touched.
  3. Determine whether legal, privacy, contractual, or regulator notification clocks have started.
  4. Decide whether rollback, vendor failover, manual review, or customer correction is required before restoration.
24-72 hours

Remediate

  1. Patch the control that failed: validation, retrieval filters, approval gates, evals, permissions, monitoring, or fallback.
  2. Re-run regression tests and targeted evals using incident examples and adjacent edge cases.
  3. Review any AI-generated customer, employee, financial, legal, or operational decisions affected by the failure.
  4. Document customer/regulator communications and evidence supporting the remediation decision.
7-14 days

Learn

  1. Run a post-incident review focused on system controls, not individual blame.
  2. Map root cause to PSF domains and update the deployment checklist.
  3. Add the incident or near-miss to the internal AI incident register.
  4. Update partner/vendor assessment records if a third-party dependency contributed.

Minimum AI incident record

Record the system, model, prompt/version, data source, user population, triggering input, observed output/action, downstream impact, containment decision, remediation owner, and PSF domains implicated.

Browse AI incidentsCAIS safety credential