This guide covers all domains tested in the CAAE examination. Each domain includes key concepts, a worked scenario, and the reasoning approach examiners expect.
Your team is evaluating prompting strategies for a financial document analysis tool. On complex multi-step calculations (e.g., computing IRR across multiple cash flows), chain-of-thought (CoT) prompting dramatically improves accuracy. On simple extraction tasks (e.g., 'What is the stated revenue on line 12?'), however, CoT decreases accuracy and increases latency. How do you reconcile these results?
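The standard resolution is to route by task type rather than apply one strategy globally. A minimal sketch, assuming a hypothetical `build_prompt` helper and illustrative `task_type` labels (neither is part of any real API):

```python
# Hypothetical routing sketch: use CoT only where multi-step reasoning
# is required, and a terse direct prompt for simple extraction.
COT_PREFIX = "Think step by step, showing each intermediate calculation.\n"
DIRECT_PREFIX = "Answer with only the requested value, no explanation.\n"

def build_prompt(task_type: str, question: str) -> str:
    """Pick a prompting strategy per task type."""
    if task_type == "calculation":   # e.g., IRR across cash flows
        return COT_PREFIX + question
    if task_type == "extraction":    # e.g., "revenue on line 12?"
        return DIRECT_PREFIX + question
    return question                  # default: no special prefix

print(build_prompt("extraction", "What is the stated revenue on line 12?"))
```

The routing decision can be a simple classifier or a rule on the request type; the point is that the prompting strategy is a per-task choice, not a global default.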
Your application requests JSON output via a system prompt instruction ('Always respond in valid JSON'). In production, approximately 0.3% of responses are malformed JSON, causing 500 errors. The errors cluster around complex nested schemas. Describe the fix.
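Part of the expected answer is to stop treating model output as trusted JSON: validate it before returning it, and retry with the parse error instead of surfacing a 500. A minimal sketch, where `call_model` is a hypothetical callable wrapping your model API:

```python
import json

def parse_json_with_retry(call_model, prompt: str, max_retries: int = 2):
    """Parse model output as JSON; on failure, re-prompt with the
    parse error rather than propagating a 500 to the caller."""
    last_error = None
    for attempt in range(max_retries + 1):
        if attempt == 0:
            raw = call_model(prompt)
        else:
            raw = call_model(
                f"{prompt}\n\nYour previous reply was not valid JSON "
                f"({last_error}). Return only valid JSON."
            )
        try:
            return json.loads(raw)
        except json.JSONDecodeError as e:
            last_error = e
    raise ValueError(f"No valid JSON after {max_retries + 1} attempts: {last_error}")
```

The stronger fix, where the provider supports it, is a structured-output or tool-use mode that constrains generation to the schema; the retry loop above is the application-side backstop.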
You deploy a research agent tasked with 'Find the most recent and comprehensive information about X.' The agent begins iterating — searching, reading, searching again with refined terms — and never terminates. After 47 tool calls and 12 minutes, it is still running. What failed, and how do you prevent it?
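The failure is an open-ended goal ("most recent and comprehensive") with no termination condition, so the agent can always justify one more search. The prevention is an explicit budget plus a concrete stopping predicate. A minimal sketch, where `step` and `stop_when` are hypothetical callables standing in for one tool-calling iteration and your completion check:

```python
def run_agent(step, max_calls: int = 10, stop_when=lambda s: s["done"]):
    """Bound an agent loop with a hard call budget and a termination
    predicate, instead of relying on the model to decide when to stop."""
    state = {"calls": 0, "done": False, "results": []}
    while state["calls"] < max_calls and not stop_when(state):
        step(state)              # one search/read iteration
        state["calls"] += 1
    return state
```

In practice the budget covers tool calls, wall-clock time, and tokens, and the task prompt itself should define "done" concretely (e.g., "stop after finding three sources published in the last year").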
You have been using BLEU score to evaluate your document summarisation model. BLEU scores average 0.42 (high for the task). However, in a human evaluation study, 35% of summaries are rated as 'poor quality' by domain experts. How do you explain this discrepancy and fix your evaluation approach?
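The core of the discrepancy is that BLEU measures n-gram overlap with a reference, not factual correctness: a summary can share almost every token with the reference while asserting the opposite. A toy illustration using only the unigram-precision component of BLEU (constructed example, not real evaluation data):

```python
def unigram_precision(candidate: str, reference: str) -> float:
    """Toy unigram precision (the core of BLEU-1): fraction of candidate
    tokens matched in the reference, with clipped counts."""
    cand, ref = candidate.lower().split(), reference.lower().split()
    ref_counts = {}
    for t in ref:
        ref_counts[t] = ref_counts.get(t, 0) + 1
    matched = 0
    for t in cand:
        if ref_counts.get(t, 0) > 0:
            matched += 1
            ref_counts[t] -= 1
    return matched / len(cand)

ref = "quarterly revenue increased by ten percent"
bad = "quarterly revenue decreased by ten percent"  # opposite meaning
print(unigram_precision(bad, ref))  # 5/6 ~ 0.83 despite being factually wrong
```

One wrong token out of six barely dents the score, which is why a 0.42 corpus BLEU can coexist with 35% expert-rated failures; the fix is to add evaluations that target what experts actually judge, such as faithfulness checks against the source document and graded human or model-based rubrics.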
A customer service bot works perfectly for the first 10 turns of a conversation. By turn 15, it starts ignoring its system prompt instructions (e.g., never offering refunds without manager approval). Investigation reveals the conversation history has grown to fill the entire context window, truncating the system prompt. What should have been done architecturally?
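The architectural fix is to make the system prompt non-negotiable: reserve its token budget first and trim only the oldest conversation turns. A minimal sketch, where `count_tokens` is a hypothetical tokenizer callable and `history` is a list of `{"role", "content"}` messages:

```python
def build_messages(system_prompt, history, max_tokens, count_tokens):
    """Assemble the request so the system prompt is always sent intact;
    when over budget, drop the OLDEST turns, never the system prompt."""
    budget = max_tokens - count_tokens(system_prompt)
    kept, used = [], 0
    for msg in reversed(history):          # walk from newest to oldest
        cost = count_tokens(msg["content"])
        if used + cost > budget:
            break                          # oldest remaining turns are dropped
        kept.append(msg)
        used += cost
    kept.reverse()
    return [{"role": "system", "content": system_prompt}] + kept
```

Production systems often go further (summarising dropped turns into a rolling digest rather than discarding them), but the invariant is the same: the system prompt is allocated first, not truncated last.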
You now have the conceptual foundation. Expect applied-reasoning questions: read each scenario and identify which architectural or prompting safeguard prevents the described failure.