A vendor-neutral framework defining what it means for a production AI deployment to be safe, auditable, and resilient. Eight domains. 40 criteria. Applicable to any model, any stack, any cloud.
The PSF is the assessment basis for CPAP and CPAA certification. Organisations can self-assess against the PSF or commission a formal PAI Deployment Assessment. The framework is published openly — cite it, reference it, build on it.
Every input reaching an AI model must be validated, sanitised, and treated as untrusted.
All user-supplied input is sanitised before model processing. Injection patterns are detected and handled.
Input schemas are validated: type, length, and structure are checked before the model call.
Inputs are logged with PII redacted. The system maintains an audit-safe record of what the model received.
Rate limiting is applied to all AI endpoints. Abuse patterns trigger alerts, not silent failures.
The system prompt is treated as a security boundary. User content cannot override system-level instructions through normal operation.
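The input criteria above can be illustrated with a minimal sketch. It assumes a single free-text user field; the length limit, the two regular expressions, and the names `check_user_input` and `InputCheck` are hypothetical and chosen only for illustration. Pattern matching of this kind is one weak layer of injection defence, not a complete one.

```python
import re
from dataclasses import dataclass

MAX_INPUT_CHARS = 4000  # assumed limit; tune per deployment

# Illustrative patterns only; real injection detection needs far broader coverage.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all|any|previous|prior) .*instructions", re.IGNORECASE),
    re.compile(r"reveal .*system prompt", re.IGNORECASE),
]


@dataclass
class InputCheck:
    ok: bool
    reason: str
    cleaned: str = ""


def check_user_input(value: object) -> InputCheck:
    """Validate type and length, strip control characters, flag suspect patterns."""
    if not isinstance(value, str):
        return InputCheck(False, "type")
    # Drop non-printable characters but keep newlines and tabs.
    cleaned = "".join(ch for ch in value if ch.isprintable() or ch in "\n\t").strip()
    if not cleaned:
        return InputCheck(False, "empty")
    if len(cleaned) > MAX_INPUT_CHARS:
        return InputCheck(False, "length")
    for pattern in INJECTION_PATTERNS:
        if pattern.search(cleaned):
            return InputCheck(False, "injection_suspected")
    return InputCheck(True, "ok", cleaned)
```

A rejected input carries a reason code, so the caller can log it and raise an alert rather than fail silently.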
Raw model output is never trusted. Every output is validated before it acts on any system.
Structured output schemas (JSON/XML) are enforced. The model is configured to return parseable, schema-conformant responses.
Schema validation runs on every response before downstream use. Validation failures trigger a defined fallback, not a crash.
Semantic validation is applied where possible: outputs are checked for coherence and completeness beyond structural correctness.
No raw model output is passed directly to a database, external API, or user interface without processing.
Output quality is scored automatically on a sample basis. Scores are tracked over time to detect degradation.
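As a sketch of the output criteria above, the following validates a JSON response against a small schema and returns a defined fallback on any failure. The schema (`category`, `confidence`), the fallback value, and the function name `parse_model_output` are assumptions made for the example; a production system might use a schema library instead of hand-written checks.

```python
import json

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"category": str, "confidence": float}
FALLBACK = {"category": "unknown", "confidence": 0.0, "fallback": True}


def parse_model_output(raw: str) -> dict:
    """Return validated fields, or a copy of FALLBACK if anything is wrong."""
    try:
        data = json.loads(raw)
    except (json.JSONDecodeError, TypeError):
        return dict(FALLBACK)
    if not isinstance(data, dict):
        return dict(FALLBACK)
    validated = {}
    for name, expected in REQUIRED_FIELDS.items():
        value = data.get(name)
        # Accept JSON integers where a float is expected (but not booleans).
        if expected is float and isinstance(value, int) and not isinstance(value, bool):
            value = float(value)
        if not isinstance(value, expected):
            return dict(FALLBACK)
        validated[name] = value
    # Simple semantic check beyond structure: confidence must be a probability.
    if not 0.0 <= validated["confidence"] <= 1.0:
        return dict(FALLBACK)
    return {**validated, "fallback": False}
```

Only the validated fields are returned, so unexpected keys in the model response never reach downstream systems.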
Personal and sensitive data is protected throughout the AI pipeline, not just at rest.
PII is identified and then redacted or tokenised before it reaches any external AI model API. The model never processes raw personal identifiers unless this is strictly necessary and the justification is documented.
The legal basis for AI processing of personal data is documented and reviewed. GDPR Article 6 (and Article 9 for special categories) requirements are met. Where the deployment is in scope of the EU AI Act, the relationship between personal-data processing and deployer/provider obligations under the Act is documented and reviewed on a defined cadence.
Data minimisation is enforced: only data strictly necessary for the AI task is included in prompts and retrieved context.
Data retention policies apply to AI inputs, outputs, and logs. PII in logs is purged on schedule.
Vector databases containing embedded personal data are treated with the same access controls and data lifecycle policies as primary databases.
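Tokenising PII before an external model call could look like the sketch below. It handles email addresses only, with a deliberately simple regular expression; the token format `<EMAIL_n>` and the function names are hypothetical. Real deployments would cover more identifier types and keep the token vault inside the trust boundary.

```python
import re
from typing import Dict, Tuple

# Simplified email pattern for illustration; not a full RFC 5322 matcher.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def tokenise_emails(text: str) -> Tuple[str, Dict[str, str]]:
    """Replace each distinct email with a stable token; return text and vault."""
    vault: Dict[str, str] = {}  # token -> original value, never sent to the model

    def _swap(match):
        value = match.group(0)
        for token, original in vault.items():
            if original == value:
                return token  # reuse the token for a repeated address
        token = f"<EMAIL_{len(vault) + 1}>"
        vault[token] = value
        return token

    return EMAIL_RE.sub(_swap, text), vault


def restore(text: str, vault: Dict[str, str]) -> str:
    """Re-insert original values into a model response, inside the trust boundary."""
    for token, original in vault.items():
        text = text.replace(token, original)
    return text
```

Because the same address always maps to the same token, the model can still reason about "the same person" without seeing the identifier itself.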
You cannot manage what you cannot measure. Every AI system in production must be observable.
Every inference call is logged with: request ID, timestamp, model name and version, input token count, output token count, latency (ms), and outcome classification. Where requests traverse multiple models or chained calls, each hop is attributed in logs (correlated request ID, model identifier per hop).
Automated output quality scoring runs on a statistically significant sample of responses. Results are stored and trended.
Drift monitoring is active: alerts fire when input distribution or output quality deviates beyond defined thresholds, including when degradation localises to a single stage in a multi-model pipeline.
User feedback signals (explicit ratings, implicit corrections, escalations) are captured and correlated with model outputs.
A dashboard or reporting mechanism exists that shows AI system health in near-real-time. On-call engineers can assess AI system status without accessing raw logs.
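The per-call log fields listed above can be captured in one structured record. This is a sketch under assumptions: the class name, field names, and the `hop` index for chained calls are illustrative, and the framework does not prescribe a particular log format.

```python
import json
import time
import uuid
from dataclasses import asdict, dataclass, field


@dataclass
class InferenceLogRecord:
    request_id: str      # shared by every hop of one user request
    hop: int             # position of this call in a chained pipeline
    model_name: str
    model_version: str
    input_tokens: int
    output_tokens: int
    latency_ms: float
    outcome: str         # e.g. "ok", "validation_failed", "fallback"
    timestamp: float = field(default_factory=time.time)


def new_request_id() -> str:
    """Generate a correlation ID to attach to every hop of a request."""
    return uuid.uuid4().hex


def to_log_line(record: InferenceLogRecord) -> str:
    """Serialise a record as one JSON line for a log pipeline."""
    return json.dumps(asdict(record), sort_keys=True)
```

Emitting one line per hop with a shared `request_id` lets a dashboard attribute latency or quality degradation to a single stage.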
Every model change is a risk. Production AI deployments require the same rigour as critical software.
All model changes (new model, new version, new system prompt) go through a defined deployment pipeline with an evaluation gate.
Canary or shadow deployment is used for significant changes. Traffic is split or the new model runs in parallel before full cutover.
Rollback to the previous model/configuration is possible within 15 minutes. The rollback procedure is documented and tested.
An evaluation suite (evals) runs automatically against every model change. Deployment is blocked if evals regress beyond threshold.
Model versions are tracked. For any output, the exact model name, version, and system prompt hash that produced it can be determined from logs.
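Two of the criteria above lend themselves to a short sketch: an evaluation gate that blocks deployment on regression, and a system prompt hash for traceability. The metric names, the 0.02 default threshold, and the function names are assumptions for illustration; higher scores are assumed to be better.

```python
import hashlib
from typing import Dict, List


def prompt_hash(system_prompt: str) -> str:
    """SHA-256 of the system prompt, suitable for storing with each log record."""
    return hashlib.sha256(system_prompt.encode("utf-8")).hexdigest()


def eval_gate(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    max_regression: float = 0.02,  # assumed tolerance; set per metric in practice
) -> List[str]:
    """Return the metrics that regress beyond the threshold; empty means pass."""
    failures = []
    for metric, base_score in baseline.items():
        cand_score = candidate.get(metric)
        # A metric missing from the candidate run counts as a failure.
        if cand_score is None or base_score - cand_score > max_regression:
            failures.append(metric)
    return sorted(failures)
```

A deployment pipeline would refuse to promote the candidate whenever `eval_gate` returns a non-empty list.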
Automation does not mean unaccountable. High-stakes AI decisions require human checkpoints.
All AI decisions in the system are classified by stakes (low/medium/high) and reversibility (reversible/irreversible). This classification is documented.
High-stakes and irreversible decisions have human-in-the-loop checkpoints. The system cannot bypass these checkpoints under normal operation. For agentic or tool-using workflows, bounded autonomy is defined (maximum consecutive autonomous actions, approval gates for irreversible tools, kill-switch or circuit-stop behaviour) and tested.
Escalation paths exist for AI failures and edge cases. When the AI cannot handle a situation, it escalates to a defined human owner — it does not silently fail or guess.
Human overrides of AI decisions are logged. Override patterns are reviewed regularly to identify systematic AI failures.
Users of AI-assisted decisions are informed they are interacting with an AI system, where legally required (for example, the transparency obligations in EU AI Act Article 50, numbered Article 52 in earlier drafts) or where material to their decision-making.
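The classification and bounded-autonomy criteria above might be sketched as follows. The example assumes that a decision needs a human checkpoint if it is either high-stakes or irreversible, and that the autonomy limit is a simple count of consecutive actions; `Decision`, `requires_human`, and `AutonomyBudget` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    name: str
    stakes: str       # "low", "medium", or "high"
    reversible: bool


def requires_human(decision: Decision) -> bool:
    """Assumed rule: high stakes or irreversibility triggers a checkpoint."""
    return decision.stakes == "high" or not decision.reversible


class AutonomyBudget:
    """Caps consecutive autonomous actions; a human approval resets the count."""

    def __init__(self, max_consecutive: int):
        self.max_consecutive = max_consecutive
        self.used = 0

    def allow(self, decision: Decision) -> bool:
        if requires_human(decision) or self.used >= self.max_consecutive:
            return False  # caller must escalate to the defined human owner
        self.used += 1
        return True

    def human_approved(self) -> None:
        self.used = 0
```

When `allow` returns `False`, the workflow stops and escalates instead of guessing, which also gives a natural point to log the override.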
AI systems introduce new attack surfaces. Standard security practices must extend to cover them.
All AI API keys and credentials are stored in a secrets manager (AWS Secrets Manager, Azure Key Vault, HashiCorp Vault, or equivalent). No credentials in source code, version control, or environment variable files committed to repositories.
Separate API keys are used per environment (development, staging, production) with appropriate spending limits on each.
The AI system's attack surface is documented: all external model calls, data sources, and tool integrations are mapped. This map is reviewed when the system changes.
Prompt injection is treated as a security vulnerability class. The system has documented defences, and injection testing is part of the security review process.
For agentic systems, the principle of least privilege applies: agents have only the permissions required for their task. Tool calls are validated before execution. Irreversible actions require confirmation.
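For the agentic criterion above, a tool-call check could be sketched as an allowlist with per-tool argument names and an irreversibility flag. The tools `search_orders` and `cancel_order`, the policy table, and the return codes are invented for the example.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet


@dataclass(frozen=True)
class ToolPolicy:
    allowed_args: FrozenSet[str]
    irreversible: bool = False


# Hypothetical allowlist: any tool not listed here is denied.
POLICIES: Dict[str, ToolPolicy] = {
    "search_orders": ToolPolicy(frozenset({"customer_id"})),
    "cancel_order": ToolPolicy(frozenset({"order_id"}), irreversible=True),
}


def authorise_tool_call(tool: str, args: dict, confirmed: bool = False) -> str:
    """Decide whether an agent's proposed tool call may run."""
    policy = POLICIES.get(tool)
    if policy is None:
        return "deny:unknown_tool"
    if set(args) - policy.allowed_args:
        return "deny:unexpected_args"
    if policy.irreversible and not confirmed:
        return "hold:needs_confirmation"
    return "allow"
```

The check runs before execution, so a prompt-injected request for an unlisted tool or an extra argument is rejected regardless of what the model output says.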
A production AI system that only works with one vendor is a liability, not an asset.
The AI model interface is abstracted behind an internal API layer. Switching the underlying model requires a configuration change, not a code rewrite.
The system has been tested with at least one alternative model. Prompt compatibility with alternatives is assessed in advance, before a switch becomes necessary.
Vendor dependencies are documented: model providers, embedding providers, vector database providers. Each has a documented contingency (alternative provider or self-hosted fallback).
The system does not use vendor-specific features in core application logic without documented justification and a migration plan.
Vendor SLA and data processing terms are reviewed and on file. The organisation knows what happens to its data when it is sent to each AI provider. Regulatory and jurisdictional posture (including EU AI Act provider/deployer allocations where relevant) is recorded alongside commercial terms and reviewed on change.
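The abstraction criterion above can be sketched as a small internal interface with configuration-driven model selection. `EchoModel` and `UpperModel` are stand-ins for real provider adapters, and `TextModel`, `REGISTRY`, and `load_model` are hypothetical names; no vendor SDK is used.

```python
from typing import Callable, Dict, Protocol


class TextModel(Protocol):
    """Internal interface that application code depends on."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in for a primary provider adapter."""

    def complete(self, prompt: str) -> str:
        return f"echo:{prompt}"


class UpperModel:
    """Stand-in for an alternative provider or self-hosted fallback."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


REGISTRY: Dict[str, Callable[[], TextModel]] = {
    "primary": EchoModel,
    "fallback": UpperModel,
}


def load_model(config: dict) -> TextModel:
    """Pick the adapter named in configuration; no application code changes."""
    name = config["model"]
    if name not in REGISTRY:
        raise ValueError(f"unknown model adapter: {name}")
    return REGISTRY[name]()
```

Switching providers then means changing the `model` value in configuration, with vendor-specific details confined to the adapter classes.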
The PAI Production Safety Framework is published under Creative Commons CC BY 4.0. You may share and adapt the framework with attribution. For historical work assessed against PSF v1.0 (2024), cite that version and year; see how to cite.