Production AI Institute · Independent certification for production AI practice
Verify a credential|Contact|

Insights / PSF / Domain Guide

Production AI Institute — PSF Domain Guide v1.0
Published: 2026-04-29 · License: CC BY 4.0
Domain: PSF-1 — Input Governance
PSF-1

Input Governance

Every production AI system has an attack surface that starts with its input layer. Before a model processes a single token, the input must be validated, sanitised, classified by intent, and checked against policy. PSF-1 governs this boundary — the point at which the outside world meets your AI infrastructure.

Why Input Governance Fails

Most AI security incidents begin at the input boundary. Prompt injection — where malicious instructions are embedded in user-supplied content — is now the most documented attack class against production LLM systems. A system without input governance trusts the user to supply well-formed, benign input. In any real-world deployment, that assumption is wrong. Inputs arrive from web forms, API integrations, document uploads, email content, and user-controlled data sources. Each of these is a vector.

The Input Governance Stack

Schema validation

Before any model call, validate that the input conforms to the expected structure and type. Reject anything that falls outside the defined schema with a structured error response — not a model-generated fallback.

Prompt injection detection

Classify inputs for adversarial intent: role-override attempts ('ignore all previous instructions'), delimiter attacks, indirect injection from user-controlled content, and jailbreak pattern matching. Block or escalate before inference.

Content policy pre-filtering

Apply content policy checks at the input stage — not only at the output stage. Some inputs should never reach the model at all. Category filtering on harmful content, PII redaction, and topic scope enforcement all belong here.

Intent classification

For multi-turn or multi-purpose systems, classify the intent of each input against the system's permitted use cases. A document summarisation system that accepts open-ended instructions will drift toward misuse without intent guardrails.

Rate limiting and quota enforcement

Abuse at scale often starts with volumetric attacks: repeated variations of an adversarial prompt until one gets through. Per-user and per-session rate limiting is part of input governance, not just operations.

System prompt isolation

In LLM-based systems, the system prompt establishes the operating context. User input must be clearly delimited from system instructions. Systems that concatenate them without structural separation are vulnerable to prompt injection by design.

Direct vs. Indirect Prompt Injection

Direct prompt injection is when the user themselves includes adversarial instructions in their input. Indirect prompt injection is more dangerous: malicious instructions are embedded in content the AI is asked to process — a document, a web page, an email body, a database record. The AI follows the embedded instruction as if it were a system command. Indirect injection is the primary attack vector for autonomous AI agents that process third-party content. Defence requires treating all third-party content as untrusted data, never as instruction.

PSF-1 Compliance Checklist

All inputs validated against a defined schema before model call
Prompt injection detection implemented at the input boundary
User input structurally separated from system prompt in all LLM calls
Content policy pre-filtering applied before inference (not only post)
PII identification and redaction applied to user inputs before logging
Intent classification implemented for multi-purpose systems
Per-user rate limiting in place with tested enforcement
Indirect injection defence: all third-party content treated as untrusted
Rejection handling defined: invalid inputs return structured errors, not model output
Input validation logs retained for audit and incident investigation

Common PSF-1 Failures in Production

  • Customer-facing chatbot leaks system prompt because user input was concatenated directly with system instructions without delimiters
  • Document processing agent executes embedded instructions in uploaded PDFs, treating document content as system commands
  • Multi-modal input system validates text but not image metadata — adversarial content embedded in EXIF data bypasses all text-layer filtering
  • Rate limiting implemented at the API gateway but not enforced per-user, allowing a single account to run distributed injection attacks
  • Input validation applied in the web UI but not at the API layer — mobile app and direct API consumers bypass all guardrails

AIDA Exam Tips for PSF-1

  • PSF-1 questions often present a scenario and ask you to identify the specific input governance failure. Focus on: was it injection? Was it missing schema validation? Was it system prompt isolation failure?
  • Distinguish direct vs. indirect injection — indirect (from third-party content) is considered the harder problem and is tested more heavily.
  • Input governance happens BEFORE the model call. If a question describes a fix that involves changing the model output, it is a PSF-2 (Output Validation) answer, not PSF-1.
  • Rate limiting belongs in PSF-1 when the context is preventing abuse at the input boundary. It appears in PSF-7 (Security) when the context is DoS prevention.
  • The answer 'validate inputs against a schema' is almost always correct for PSF-1 scenario questions.

Certifications that assess PSF-1

AIDA ExaminationCAIS — AI Safety SpecialistCPAP Portfolio
Full PSF FrameworkStudy GuidePractice Exam