Eight real-world audit vignettes — one per PAI-8 control. Each case presents an organisational scenario, the auditor's analysis, evidence requirements, and findings classification. These cases map directly to the 30-question CAIA exam.
An energy company has a published AI ethics policy, an AI Ethics Officer role on the org chart, and a quarterly AI risk meeting. During your interview, the CISO says they own all AI risk. The CTO says AI governance is a technical matter owned by the AI team. The Legal team says they signed off the policy but have not attended any risk meeting. You find no decision gate records showing AI governance applied to any deployment decision in the past 12 months.
C1 maturity: L1 (Basic). Policy exists and a role is assigned, but governance is not applied to real decisions. L2 requires documented decision gate records.
A governance framework that does not produce evidence of decisions being made is an L1 finding regardless of how many committees exist. Look for records, not org charts.
A healthcare provider completes a formal AI risk assessment each January covering all deployed AI systems. In March they upgrade their patient triage model to a new foundation model. In July they expand its use from emergency triage to routine appointment prioritisation. In October you audit them. They present the January risk assessment as evidence of C2 compliance.
C2 maturity: L1. Annual assessment exists but trigger-based reassessment is absent. Model upgrade and use case expansion are both explicit PAI-8 C2 triggers.
Annual cadence is the minimum baseline. Trigger-based reassessment is the differentiator between L1 and L2. Always check whether model changes or use case expansions occurred after the last assessment.
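The check described above can be sketched in a few lines of Python. This is a minimal illustration, not a PAI-8 artefact: the `ChangeEvent` type, the trigger names, and the specific dates are assumptions made for the example (the scenario gives only the months, not the year or day).

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical trigger labels. The case names model upgrades and use case
# expansions as C2 triggers; the string values themselves are assumptions.
REASSESSMENT_TRIGGERS = {"model_upgrade", "use_case_expansion"}


@dataclass(frozen=True)
class ChangeEvent:
    occurred_on: date
    kind: str
    description: str


def unassessed_triggers(last_assessment: date, changes: list) -> list:
    """Return trigger events that occurred after the last risk assessment."""
    return [
        c for c in changes
        if c.kind in REASSESSMENT_TRIGGERS and c.occurred_on > last_assessment
    ]


# Dates are illustrative: January assessment, March upgrade, July expansion.
changes = [
    ChangeEvent(date(2024, 3, 1), "model_upgrade", "Triage model moved to new foundation model"),
    ChangeEvent(date(2024, 7, 1), "use_case_expansion", "Extended to routine appointment prioritisation"),
]
stale = unassessed_triggers(date(2024, 1, 15), changes)
for event in stale:
    print(f"No reassessment after {event.occurred_on}: {event.description}")
```

Any event returned by `unassessed_triggers` is a change the January assessment cannot cover, which is the gap the auditor is looking for.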
A SaaS company fine-tuned a support model on 3 years of customer support emails. The emails contained PII, occasional health disclosures, and financial information. The data team confirms no consent was obtained for model training use. The model is now in production. When asked for a data lineage document, the data team provides a pipeline diagram showing how data flows into training — but no record of what data was included, excluded, or reviewed.
C3 maturity: L1. Training data exists and was used, but provenance, consent, and documentation are absent. L2 requires documented provenance and consent/lawful basis.
A pipeline diagram is not a data stewardship record. Look for consent documentation, exclusion records, and evidence that the organisation actually reviewed what went into training.
A financial services firm deploys a credit scoring model. They present benchmark results from three academic datasets and an internal A/B test showing 94% accuracy. In production, the model disproportionately denies credit to applicants from two postcodes, a pattern that correlates strongly with ethnicity. No bias testing was conducted pre-deployment. The model was approved by the AI team without an independent review.
C4 maturity: L1. Pre-deployment testing occurred but lacked bias evaluation and independent review. Deployment without fairness testing in a high-risk use case is a C4 critical finding.
Accuracy metrics do not substitute for fairness evaluation. For high-risk use cases, look for evidence of protected characteristic testing and independent review.
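To see why a headline accuracy figure can hide this problem, it helps to compute approval rates per group. The sketch below uses a disparate impact ratio with a 0.8 screening threshold (the "four-fifths" rule of thumb). That metric and threshold are assumptions chosen for illustration, since the case does not prescribe a specific fairness test, and the decision data is synthetic.

```python
from collections import defaultdict


def approval_rates(decisions):
    """decisions: iterable of (group, approved) pairs -> {group: approval rate}."""
    totals = defaultdict(int)
    approved = defaultdict(int)
    for group, ok in decisions:
        totals[group] += 1
        approved[group] += int(ok)
    return {g: approved[g] / totals[g] for g in totals}


def disparate_impact_ratio(rates):
    """Lowest group approval rate divided by the highest group approval rate."""
    highest = max(rates.values())
    return min(rates.values()) / highest if highest > 0 else 0.0


# Synthetic decisions for two hypothetical groups, A and B.
decisions = (
    [("A", True)] * 80 + [("A", False)] * 20
    + [("B", True)] * 50 + [("B", False)] * 50
)
rates = approval_rates(decisions)
ratio = disparate_impact_ratio(rates)
flagged = ratio < 0.8  # assumed screening threshold, not a PAI-8 requirement
print(rates, round(ratio, 3), flagged)
```

A single ratio is only a screening signal. An auditor would still expect evidence that the testing covered the relevant protected characteristics and was reviewed independently.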
A logistics company deploys a route optimisation agent that makes autonomous delivery reassignment decisions. The system has a documented human override procedure in the operations manual. During stakeholder interviews, three of four operations managers are unaware an override is possible. The fourth found it 'by accident'. No override has ever been logged. The system has been live for 18 months.
C5 maturity: L1. Human oversight mechanism exists on paper but is not operational. L2 requires staff awareness, training records, and logged override events.
An override nobody knows about is not human oversight — it is documentation. The test is whether a human can actually intervene, not whether an override exists in a manual.
A retailer's recommendation engine begins surfacing inappropriate product combinations to users, recommending items frequently associated with self-harm alongside unrelated products. The issue persists for 6 hours before being noticed. It is logged in the IT incident management system as a 'software defect — recommendation algorithm' with a 3-day response SLA. No post-incident review is conducted. No regulators are notified.
C6 maturity: L0–L1. No AI-specific incident classification exists. The incident was handled under a generic IT process that is structurally inadequate for AI harm events.
AI incidents do not map cleanly to software defect categories. Without an AI-specific incident taxonomy, high-severity AI harm events will be systematically mis-classified and under-responded to.
A benefits agency uses an AI system to assess eligibility for housing assistance. An applicant challenges a denial. The agency's legal team requests the reasoning behind the specific decision. The AI team can produce the model version number but cannot reconstruct which data was used for this specific applicant, what the model's intermediate reasoning was, or why this decision differed from those for similar applicants. Logs are retained for 30 days; the decision was made 45 days ago.
C7 maturity: L1. Some logging exists (model version) but decision-level audit trail is inadequate for the use case. In regulated contexts, this is a critical finding.
A model version number is not an audit trail. Auditable AI requires decision-level logging: who was assessed, with what data, by which model, producing what output. Retention must outlast the challenge window.
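A decision-level record and a retention check might look like the following sketch. The field names in `DecisionRecord` and the example values are hypothetical. Only the 30-day retention period and the 45-day-old decision come from the case.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class DecisionRecord:
    subject_id: str          # who was assessed
    input_snapshot_ref: str  # pointer to the exact data used (hypothetical field)
    model_version: str       # which model produced the decision
    output: str              # what the model decided
    decided_at: datetime


def is_retrievable(record: DecisionRecord, retention: timedelta, now: datetime) -> bool:
    """True if the record is still inside the log retention period."""
    return now - record.decided_at <= retention


def retention_covers(retention: timedelta, challenge_window: timedelta) -> bool:
    """True if logs are kept at least as long as decisions can be challenged."""
    return retention >= challenge_window


now = datetime(2024, 6, 15)  # illustrative date
record = DecisionRecord(
    subject_id="applicant-001",
    input_snapshot_ref="snapshots/applicant-001/v1",
    model_version="eligibility-model-2.3",
    output="denied",
    decided_at=now - timedelta(days=45),
)
retention = timedelta(days=30)
print(is_retrievable(record, retention, now))
# The challenge arrived 45 days after the decision, so the window is at least 45 days.
print(retention_covers(retention, timedelta(days=45)))
```

Both checks fail for the agency in this case: the record has already aged out, and the retention period is shorter than the period in which a challenge can arrive.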
A legal tech company's contract analysis product is built on a third-party model API. The API provider announces that the model version in use will be deprecated in 30 days. The legal tech company has no contractual minimum notice period, no alternative model evaluated, and no documented continuity plan. Their vendor contract requires 99.9% uptime but has no specific provision for model version deprecation. Their largest customer's contract requires 90 days' notice of any change to service continuity.
C8 maturity: L1. A vendor exists and is being used, but vendor risk is unmanaged. No continuity planning, no contractual protections, no inventory of dependencies.
API availability SLAs and model version continuity SLAs are different things. An AI vendor can be 100% uptime compliant while simultaneously withdrawing the model you depend on. C8 requires specific contractual protections for model continuity.
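The gap between what the vendor gives and what the customer is owed is simple arithmetic, sketched below. The `VendorContract` type and the wording of the findings are hypothetical. The 99.9% uptime figure, the 30-day deprecation notice, and the 90-day customer requirement come from the case.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class VendorContract:
    uptime_sla_pct: float
    deprecation_notice_days: Optional[int]  # None means no contractual minimum


def notice_shortfall_days(vendor_notice_days: int, customer_notice_days: int) -> int:
    """Days by which the vendor's notice falls short of the customer commitment."""
    return max(0, customer_notice_days - vendor_notice_days)


def continuity_findings(contract, announced_notice_days, customer_notice_days, has_continuity_plan):
    """Collect continuity gaps; the finding texts are illustrative, not PAI-8 wording."""
    findings = []
    if contract.deprecation_notice_days is None:
        findings.append("no contractual minimum notice for model deprecation")
    shortfall = notice_shortfall_days(announced_notice_days, customer_notice_days)
    if shortfall:
        findings.append(f"vendor notice falls {shortfall} days short of customer commitment")
    if not has_continuity_plan:
        findings.append("no documented continuity plan")
    return findings


contract = VendorContract(uptime_sla_pct=99.9, deprecation_notice_days=None)
findings = continuity_findings(contract, 30, 90, has_continuity_plan=False)
for finding in findings:
    print(finding)
```

The uptime SLA never enters the calculation, which mirrors the point of the case: the vendor can meet 99.9% availability and still leave a 60-day gap against the customer's 90-day notice requirement.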
30 scenario-based questions covering all 8 PAI-8 controls. Pass threshold: 22/30. Exam fee: $97. Credential valid on the PAI registry upon passing.