Guide · May 2026

21 Agentic Design Patterns
A Complete Guide for Business Professionals

Every production AI system is built from a small set of reusable patterns. This guide explains all 21 — from the basics to enterprise-scale architecture — in plain language, without a single line of code.

Production AI Institute · Based on the Agentic Design Patterns open-source book (github.com/evoiz/Agentic-Design-Patterns) · Published May 2026 · CC BY 4.0

Why patterns matter

The most common mistake in enterprise AI is treating every deployment as a unique engineering problem that requires inventing the solution from scratch. It is not. The same architectural patterns appear across every industry, every function, and every use case. Once you recognise them, you can apply them deliberately — and avoid the failures that come from applying them accidentally.

These 21 patterns come from the open-source book Agentic Design Patterns by the evoiz team, which covers agent architecture from beginner to enterprise level. Each pattern comes with a Jupyter Notebook for those who want to run code. This guide is for everyone else — the domain experts, the business analysts, the managers, and the operators who need to understand what they are deploying without necessarily implementing it themselves.

Each pattern is mapped to a PSF domain — the Production Safety Framework domain most relevant to getting that pattern right in production.

Part 1: Foundation Patterns

The seven patterns that appear in almost every production AI system. If you understand nothing else, understand these.

Prompt Chaining

PSF D2

Breaking a complex task into a sequence of smaller prompts where each output feeds the next input. Like an assembly line — each station does one thing well, and the product moves forward.

In practice
Draft email → fact-check claims → adjust tone → format → send. Each step is a separate model call, reviewed at each stage.
Why it matters: Easier to debug, audit, and improve than one giant prompt that does everything.

Routing

PSF D5

A classifier that reads an input and decides which specialist agent or pipeline should handle it. The dispatcher of the agent world.

In practice
A customer query arrives. The router classifies it as 'billing dispute', 'technical support', or 'general enquiry' and sends it to the right downstream agent.
Why it matters: Different task types need different agents, tools, and safety guardrails. Routing ensures the right tool handles the right job.

Parallelism

PSF D4

Running multiple agent tasks simultaneously rather than sequentially, then combining the results. What takes 10 minutes sequentially can take 1 minute in parallel.

In practice
Research an acquisition target: one agent analyses financials, another reviews press coverage, a third checks regulatory filings — all at the same time.
Why it matters: For tasks where sub-questions are independent, parallelism is the difference between an agent that takes 90 minutes and one that takes 9.

Reflection

PSF D2

An agent that reviews and critiques its own output, then revises it. A built-in quality check loop before the result reaches a human.

In practice
Agent drafts a legal clause. Reflection prompt: 'Review this clause for ambiguity, enforceability, and consistency with the contract defined so far.' Agent revises.
Why it matters: Catching errors before output reaches humans is always cheaper than catching them after. Reflection adds a layer of automated review.

Tool Calling

PSF D1

The pattern that turns a language model from a text generator into an actor. The model can invoke external tools — search the web, query a database, run code, send an email — and use the result to continue.

In practice
Agent is asked for the current share price. It calls a market data tool, gets the result, and incorporates it into its response.
Why it matters: Without tool calling, agents are limited to their training data. With it, they can act on live information.

Multi-Agent Collaboration

PSF D6

Multiple specialised agents working together, each owning a domain of the overall task, coordinated by an orchestrator.

In practice
An onboarding orchestrator coordinates: an HR agent (contracts), an IT agent (provisioning access), a facilities agent (desk allocation), a payroll agent (setup).
Why it matters: Specialisation produces better outputs. Coordination makes the specialisation useful. This is the dominant pattern in enterprise AI.

Orchestration

PSF D6

The meta-pattern: a controlling agent that directs sub-agents, manages state, handles failures, and decides when the overall task is complete.

In practice
The orchestrator for a contract review workflow: assign clauses to specialist agents, collect findings, request human review of flagged items, compile final report.
Why it matters: Without orchestration, multi-agent systems produce uncoordinated outputs. With it, they behave as coherent systems.

Part 2: Production Patterns

The patterns that determine whether an agent works in a demo or works in production — every day, at scale, without supervision.

Memory Management

PSF D3

How an agent stores and retrieves information across sessions, tools, and agent boundaries. The difference between an agent that forgets everything at the end of a conversation and one that builds an accurate model of your business over time.

In practice
A sales agent that remembers every customer interaction, the state of every deal, and the preferences of every contact — without needing to be told again each time.
Why it matters: Memory is the accumulation of value. An agent that cannot remember is an agent that cannot improve.

Exception Recovery

PSF D5

How an agent detects that something has gone wrong and decides what to do about it: retry, escalate, skip, or fail gracefully.

In practice
Agent calls an API that returns an error. Recovery logic: retry once with exponential backoff; if still failing, flag for human review and continue with remaining tasks.
Why it matters: Production AI systems will fail. The question is whether they fail gracefully or catastrophically. Exception recovery is the difference.

Human-AI Collaboration

PSF D6

The patterns for deciding when an agent should act autonomously and when it should pause and involve a human. The checkpoint architecture of production AI.

In practice
An agent processes expense reports autonomously below $500. Above $500, it prepares a summary and waits for manager approval before proceeding.
Why it matters: Fully autonomous systems are not appropriate for all tasks. The human-in-the-loop design pattern is a safety architecture, not a limitation.

Safety Guardrails

PSF D1

The input and output filters that prevent agents from receiving or producing content they should not. The bouncer at the door of every agent interaction.

In practice
Input guardrail: reject any prompt that contains PII. Output guardrail: before sending any external communication, strip content that matches the organisation's confidential data pattern.
Why it matters: Guardrails are the difference between a contained incident and a regulatory breach. They belong in architecture, not in the prompt.

Performance Evaluation

PSF D4

Systematic measurement of whether an agent is actually doing what it should. Not just 'does it run' but 'does it produce the right outputs at the right quality level'.

In practice
A contract review agent is evaluated weekly: what percentage of flagged issues were confirmed by human review? What percentage of approved contracts had issues that should have been flagged?
Why it matters: Without measurement, agent performance drifts. With it, you can improve performance — and prove to a regulator or auditor that you are doing so.

Context Window Management

PSF D3

Strategies for fitting the information an agent needs into the finite context it can process. What to include, what to summarise, what to retrieve on demand.

In practice
A document review agent that summarises prior sections before processing new ones, rather than trying to fit a 100-page contract into a single context.
Why it matters: Context limits are not going away. The agent operator who understands how to manage context is the one whose agents produce coherent outputs on long tasks.

Retrieval-Augmented Generation (RAG)

PSF D3

Connecting an agent to an external knowledge base so it can retrieve relevant information at inference time rather than relying on training data alone.

In practice
A policy Q&A agent connected to your company's internal policy library. It retrieves the relevant policies before answering, rather than hallucinating what it thinks they say.
Why it matters: Training data is historical and generic. Your organisation's knowledge is current and specific. RAG bridges that gap.

Part 3: Enterprise Patterns

Advanced patterns for organisations running multiple agents at scale, often in regulated environments.

Event-Driven Agents

PSF D5

Agents triggered by events in your systems rather than by direct user prompts. The agent that wakes up when something happens, rather than when someone asks.

In practice
A monitoring agent that triggers when a server response time exceeds 2 seconds — investigates, diagnoses, and either resolves autonomously or escalates with a drafted incident report.
Why it matters: The most valuable agents are the ones running when you are not watching. Event-driven architecture makes agents reactive to the world rather than reactive to prompts.

Feedback Loops

PSF D4

Architectures that route the outputs of one agent cycle back as inputs to improve the next cycle. The flywheel of autonomous systems.

In practice
A content agent that tracks which articles generate the most engagement. Higher-engagement patterns are fed back as guidance for the next content generation cycle.
Why it matters: Feedback loops are how agents improve without manual intervention. Without them, an agent repeats the same mistakes indefinitely.

Swarm Intelligence

PSF D6

Many simple agents working in parallel on variations of the same problem, with a synthesis layer that combines the best outputs.

In practice
A competitive analysis: 20 agents each analyse one competitor. A synthesis agent identifies the patterns across all 20 and produces a consolidated strategic summary.
Why it matters: For problems where variety and coverage matter more than depth from a single source, swarm patterns outperform single-agent approaches.

Hierarchical Agents

PSF D6

An agent hierarchy where strategic agents direct tactical agents that direct operational agents. The org chart pattern applied to AI systems.

In practice
A CEO agent sets quarterly priorities. Department agents translate priorities into weekly tasks. Task agents execute individual work items. Results flow back up.
Why it matters: Complex enterprise processes naturally decompose hierarchically. The patterns that work for human organisations also work for agent organisations.

Self-Improvement Agents

PSF D7

Agents that analyse their own performance and propose changes to their own prompts, tools, or workflows — with human approval before any change is applied.

In practice
A support agent that reviews conversations where users were unsatisfied and proposes updated response strategies for the agent operator to review and approve.
Why it matters: This pattern accelerates improvement but requires strict human oversight. A self-improving agent without a review gate is a security risk.

Debate and Verification

PSF D2

Two or more agents take opposing positions on a question, then a third evaluates the debate and produces a verified conclusion.

In practice
Agent A argues for executing a trade. Agent B argues against. Verification agent assesses both arguments and recommends proceed/reject/escalate.
Why it matters: For high-stakes decisions, adversarial review produces more reliable outputs than a single agent answering the question once.

Curriculum-Based Learning

PSF D4

Agents that are tested against progressively harder evaluation sets, with the difficulty level dynamically adjusted based on performance.

In practice
A medical coding agent evaluated against progressively more complex cases, with the difficulty automatically advanced once accuracy at the current level exceeds 95%.
Why it matters: Static evaluation misses the edge cases at the frontier of capability. Curriculum patterns keep agents challenged at the right level.

PSF domain mapping

The 21 patterns distribute across the PSF domains as follows. A well-architected production AI system will address all 8 domains.

D1 Input Governance
Tool Calling, Safety Guardrails
D2 Output Validation
Reflection, Debate & Verification, Prompt Chaining
D3 Data Protection
Memory Management, Context Management, RAG
D4 Observability
Parallelism, Performance Evaluation, Feedback Loops, Curriculum
D5 Deployment Safety
Routing, Exception Recovery, Event-Driven
D6 Human Oversight
Human-AI Collaboration, Multi-Agent, Orchestration, Hierarchical, Swarm
D7 Security
Self-Improvement (with human gate)
D8 Vendor Resilience
Applies to all external tool calls
Governance is a pattern too

Every pattern above has a governance implication. In regulated environments — finance, healthcare, government — the question is not just "does this pattern work?" but "can you prove it works, and can an auditor verify it?" The CAIG and CAIAUD credentials certify that ability.

CAIG — AI Governance →
CAIAUD — AI Auditor →
New · PAI Pattern Library

Deep-dive pages for all 21 patterns

We've built a dedicated page for each pattern — with PSF domain alignment, PAI-8 control mappings, production failure modes, implementation checklists, and certification relevance. Bookmark the full library.

Browse Pattern Library →
Start: Prompt Chaining
From reading to credential

You understand the patterns.
Get the credential that proves it.

The AIDA examination tests applied PSF knowledge across all eight domains — exactly the territory this guide covers. 15 minutes. No charge. Ever.

Start AIDA — free →
CPAP practitioner credential