Two agents take opposing positions; a third evaluates the debate and produces a verified conclusion.
Debate and verification applies the logic of adversarial review to AI-generated outputs. For high-stakes decisions, rather than one agent producing a single answer, two agents argue opposing positions and a verification agent evaluates the quality of both arguments to reach a more reliable conclusion.
The pattern has three roles. The proposing agent argues for a particular conclusion: it should approve the contract, the transaction presents elevated risk, the diagnosis fits the presented symptoms. The opposing agent argues against: it should not approve, the risk is within tolerance, the diagnosis doesn't account for these alternative explanations. Both agents are required to argue their assigned position regardless of their own prior outputs — the goal is adversarial rigour, not genuine disagreement. The verification agent reads both arguments and evaluates them on quality of reasoning and grounding in evidence, not on which position it initially agrees with. For highest-stakes decisions, the verification step may be performed by a human rather than another agent.
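The control flow is simple enough to sketch in a few lines. The following is a minimal illustration rather than a production implementation: `call_llm` is a hypothetical helper standing in for whatever chat-model API is in use, and the prompts are compressed for brevity.

```python
def call_llm(system: str, user: str) -> str:
    """Hypothetical wrapper around a chat-completion API; wire to a real provider."""
    raise NotImplementedError

def debate_and_verify(question: str, evidence: str) -> str:
    # Proposing agent: argues FOR the conclusion, regardless of its own prior view.
    proposal = call_llm(
        system=("You argue FOR the proposed conclusion. Make the strongest "
                "case the evidence allows, even if you privately disagree."),
        user=f"Question: {question}\nEvidence: {evidence}",
    )
    # Opposing agent: argues AGAINST, and must not concede.
    opposition = call_llm(
        system=("You argue AGAINST the proposed conclusion. Construct the "
                "strongest possible counter-case; do not concede."),
        user=(f"Question: {question}\nEvidence: {evidence}\n"
              f"Argument to rebut: {proposal}"),
    )
    # Verification agent: judges argument quality, not its own preference.
    return call_llm(
        system=("You are a neutral verifier. Judge both arguments on quality "
                "of reasoning and grounding in the evidence, not on which "
                "conclusion you would have reached yourself."),
        user=(f"Question: {question}\nEvidence: {evidence}\n"
              f"Argument for: {proposal}\nArgument against: {opposition}"),
    )
```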
A venture capital firm uses debate and verification for investment decisions above a defined threshold. When a deal reaches the term sheet stage, two agents are instantiated: one is instructed to construct the strongest possible investment case, the other the strongest possible case for declining. Both produce structured arguments covering market, team, financials, and risk. A committee agent reads both arguments and produces a structured assessment: where the two sides agree (facts), where they disagree (interpretations), and the strength of each side's reasoning. This assessment, along with both arguments in full, goes to the firm's human investment committee as briefing material.
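The assessment described above maps naturally onto a small structured schema. A hypothetical sketch, with field names invented for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class DebateAssessment:
    """Hypothetical schema for the committee agent's output;
    field names are invented for illustration."""
    agreed_facts: list[str] = field(default_factory=list)     # where both sides agree
    disputed_points: list[str] = field(default_factory=list)  # where interpretations diverge
    proposal_strength: float = 0.0    # 0-1 rating of the investment case
    opposition_strength: float = 0.0  # 0-1 rating of the case for declining
    summary: str = ""                 # narrative briefing for the human committee
```

Passing both full arguments to the committee alongside this summary, as the firm does, preserves the detail the structured assessment necessarily compresses.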
Single-agent outputs, even when high quality, represent one perspective. For consequential decisions — investments, clinical recommendations, legal positions, major contracts — the cost of a wrong decision justifies the overhead of adversarial review. Debate and verification systematically surfaces the strongest counterarguments before a decision is made, rather than discovering them after the fact.
How this pattern fails in practice — and what to watch for.
Both debating agents converge on the same position after minimal exchange, with one side conceding too quickly. This happens when the underlying model's tendency toward agreeableness overrides the instruction to argue an assigned position. The debate produces false consensus without genuine adversarial pressure.
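One inexpensive guard is to inspect the opposition's output for concession before accepting the debate as valid, regenerating or escalating when it caves. A rough sketch; the lexical check is deliberately crude, and `generate_opposition` stands in for however the opposing argument is produced:

```python
from typing import Callable

CONCESSION_MARKERS = (
    "i agree with the proposal",
    "the proposer is correct",
    "i concede",
)

def opposition_conceded(argument: str) -> bool:
    """Crude lexical check for premature concession; a real system would use
    a classifier or an LLM judge rather than keyword matching."""
    text = argument.lower()
    return any(marker in text for marker in CONCESSION_MARKERS)

def require_adversarial_opposition(
    generate_opposition: Callable[[], str],
    max_attempts: int = 3,
) -> str:
    """Regenerate the opposing argument until it stops conceding, and fail
    loudly rather than passing false consensus on to the verifier."""
    for _ in range(max_attempts):
        argument = generate_opposition()
        if not opposition_conceded(argument):
            return argument
    raise RuntimeError(
        "Opposing agent conceded on every attempt; escalate for review "
        "instead of treating the debate as genuinely adversarial."
    )
```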
The opposing agent constructs only weak arguments against a position it treats as obviously correct, creating the appearance of debate while ensuring the proposing agent wins. This typically happens when the same model generates both sides and its implicit prior strongly favours one position.
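One mitigation is to draw the two sides from different model families, so a single model's prior cannot quietly script both arguments. A sketch under that assumption; `ChatModel` is a hypothetical interface, not a real library type:

```python
from typing import Protocol

class ChatModel(Protocol):
    """Anything that turns (system, user) prompts into a reply."""
    def complete(self, system: str, user: str) -> str: ...

def generate_debate(
    proposer: ChatModel, opposer: ChatModel, question: str, evidence: str
) -> tuple[str, str]:
    """Source each side from a different model instance, ideally a different
    model family, so one shared prior cannot script both arguments."""
    proposal = proposer.complete(
        system="Argue FOR the conclusion as strongly as the evidence allows.",
        user=f"Question: {question}\nEvidence: {evidence}",
    )
    opposition = opposer.complete(
        system="Argue AGAINST the conclusion. Steel-man the counter-case.",
        user=f"Question: {question}\nEvidence: {evidence}\nRebut: {proposal}",
    )
    return proposal, opposition
```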
The verification agent has a systematic bias toward arguments presented in a particular format or style, or toward positions that align with its training data's dominant viewpoints. It consistently rules in favour of the proposing agent regardless of the actual quality of the opposition's arguments.
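A cheap probe for this failure is a swap test: run verification twice with the presentation order of the arguments exchanged, and flag the verifier if its verdict follows position rather than content. A sketch, reusing the hypothetical `call_llm` helper from the first example:

```python
def verify(first_argument: str, second_argument: str) -> str:
    """Ask the verifier which argument is stronger. Expects the single word
    FIRST or SECOND; reuses the hypothetical call_llm helper defined earlier."""
    reply = call_llm(
        system=("Judge the two arguments on reasoning quality and evidential "
                "grounding. Answer with exactly one word: FIRST or SECOND."),
        user=(f"First argument: {first_argument}\n\n"
              f"Second argument: {second_argument}"),
    )
    return reply.strip().upper()

def verifier_is_consistent(proposal: str, opposition: str) -> bool:
    """Run verification twice with presentation order exchanged. A sound
    verifier picks the same argument both times; if the verdict tracks the
    position instead, treat the verifier as biased."""
    run_a = verify(proposal, opposition)   # proposal presented first
    run_b = verify(opposition, proposal)   # opposition presented first
    # Consistent outcomes: the same underlying argument wins in both runs.
    return (run_a, run_b) in {("FIRST", "SECOND"), ("SECOND", "FIRST")}
```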
Seven things to verify before deploying this pattern in production.
Debate and verification is an advanced topic covered in CAIG and CAIAUD. It appears in CAIG in the context of decision governance for high-stakes AI outputs. CAIAUD auditors are tested on their ability to assess the rigour of a debate architecture: is the opposition genuinely adversarial, or is it structured to confirm the default position? This is one of the most nuanced audit competencies in the PAI curriculum.