Published: 2026-04-29 · License: CC BY 4.0
Domain: PSF-1 — Input Governance
Input Governance
Every production AI system has an attack surface that starts with its input layer. Before a model processes a single token, the input must be validated, sanitised, classified by intent, and checked against policy. PSF-1 governs this boundary — the point at which the outside world meets your AI infrastructure.
Why Input Governance Fails
Most AI security incidents begin at the input boundary. Prompt injection — where malicious instructions are embedded in user-supplied content — is now the most documented attack class against production LLM systems. A system without input governance trusts the user to supply well-formed, benign input. In any real-world deployment, that assumption is wrong. Inputs arrive from web forms, API integrations, document uploads, email content, and user-controlled data sources. Each of these is a vector.
The Input Governance Stack
Before any model call, validate that the input conforms to the expected structure and type. Reject anything that falls outside the defined schema with a structured error response — not a model-generated fallback.
Classify inputs for adversarial intent: role-override attempts ('ignore all previous instructions'), delimiter attacks, indirect injection from user-controlled content, and jailbreak pattern matching. Block or escalate before inference.
Apply content policy checks at the input stage — not only at the output stage. Some inputs should never reach the model at all. Category filtering on harmful content, PII redaction, and topic scope enforcement all belong here.
For multi-turn or multi-purpose systems, classify the intent of each input against the system's permitted use cases. A document summarisation system that accepts open-ended instructions will drift toward misuse without intent guardrails.
Abuse at scale often starts with volumetric attacks: repeated variations of an adversarial prompt until one gets through. Per-user and per-session rate limiting is part of input governance, not just operations.
In LLM-based systems, the system prompt establishes the operating context. User input must be clearly delimited from system instructions. Systems that concatenate them without structural separation are vulnerable to prompt injection by design.
Direct vs. Indirect Prompt Injection
Direct prompt injection is when the user themselves includes adversarial instructions in their input. Indirect prompt injection is more dangerous: malicious instructions are embedded in content the AI is asked to process — a document, a web page, an email body, a database record. The AI follows the embedded instruction as if it were a system command. Indirect injection is the primary attack vector for autonomous AI agents that process third-party content. Defence requires treating all third-party content as untrusted data, never as instruction.
PSF-1 Compliance Checklist
Common PSF-1 Failures in Production
- Customer-facing chatbot leaks system prompt because user input was concatenated directly with system instructions without delimiters
- Document processing agent executes embedded instructions in uploaded PDFs, treating document content as system commands
- Multi-modal input system validates text but not image metadata — adversarial content embedded in EXIF data bypasses all text-layer filtering
- Rate limiting implemented at the API gateway but not enforced per-user, allowing a single account to run distributed injection attacks
- Input validation applied in the web UI but not at the API layer — mobile app and direct API consumers bypass all guardrails
AIDA Exam Tips for PSF-1
- PSF-1 questions often present a scenario and ask you to identify the specific input governance failure. Focus on: was it injection? Was it missing schema validation? Was it system prompt isolation failure?
- Distinguish direct vs. indirect injection — indirect (from third-party content) is considered the harder problem and is tested more heavily.
- Input governance happens BEFORE the model call. If a question describes a fix that involves changing the model output, it is a PSF-2 (Output Validation) answer, not PSF-1.
- Rate limiting belongs in PSF-1 when the context is preventing abuse at the input boundary. It appears in PSF-7 (Security) when the context is DoS prevention.
- The answer 'validate inputs against a schema' is almost always correct for PSF-1 scenario questions.