This guide takes you from a blank project to a Claude agent running safely in production — with tool use, input governance, human oversight, monitoring, and PSF compliance built in. A practitioner with basic Python knowledge can follow every step and end up with an agent they can confidently hand to a client or deploy internally.
A Claude agent is a Claude model instance running with a persistent system prompt that defines its role, scope, and constraints — plus, optionally, a set of tools (functions) the model can call to interact with external systems. That is the whole architecture. The complexity comes from configuring it safely and operating it responsibly.
Before writing a line of code, understand what you are introducing into the environment:
This guide focuses on the tool-using conversational agent — the most common production deployment and the one with the most safety considerations. The principles apply to all patterns.
# Install the Anthropic SDK
pip install anthropic
# Set key in environment (dev only — use secrets manager in production)
export ANTHROPIC_API_KEY="sk-ant-..."
# Verify setup
python -c "import anthropic; print(anthropic.Anthropic().models.list())"
The system prompt is the most important thing you will write. It defines what your agent is, what it can do, what it cannot do, and how it should behave in edge cases. Spend more time on it than you think you need to.
A production-grade system prompt has four components:
SYSTEM_PROMPT = """
You are a document analysis assistant for Meridian Legal Services.
## Role and scope
You analyse legal documents submitted by Meridian staff and extract:
- Key parties (names, roles, and responsibilities)
- Critical dates and deadlines
- Obligations on each party
- Identified risk clauses
You do NOT provide legal advice, legal opinions, or recommendations
on whether to sign or reject documents. You surface information;
lawyers make decisions.
## What you will and will not do
WILL DO:
- Extract structured information from documents
- Flag clauses that match known risk patterns (list below)
- Ask clarifying questions if document type is unclear
- Summarise documents in plain English for non-lawyers
WILL NOT DO:
- Give legal advice or opinions
- Summarise documents outside the legal/contractual category
- Retain or reference information from previous sessions
- Access external systems or search the internet
## Risk clause patterns to flag
- Indemnity clauses broader than direct damages
- Liability caps below £100,000
- Governing law clauses in non-UK/EU jurisdictions
- Automatic renewal clauses with notice periods under 60 days
- IP assignment clauses not limited to work product
## Output format
Always respond in JSON with this structure:
{
"parties": [...],
"key_dates": [...],
"obligations": {...},
"risk_flags": [...],
"summary": "..."
}
## Uncertainty handling
If you cannot extract a field with confidence, set it to null
and add a note in the "uncertainty_notes" field explaining why.
Do not guess. Do not hallucinate party names or dates.
"""
import anthropic
import json
client = anthropic.Anthropic()
def analyse_document(document_text: str) -> dict:
message = client.messages.create(
model="claude-sonnet-4-5-20251022", # pinned version
max_tokens=2048,
system=SYSTEM_PROMPT,
messages=[
{
"role": "user",
"content": f"Please analyse this document:\n\n{document_text}"
}
]
)
# Extract text from response
response_text = message.content[0].text
# Parse structured output
try:
return json.loads(response_text)
except json.JSONDecodeError:
# Log this — Claude didn't follow the format instruction
raise ValueError(f"Model returned non-JSON response: {response_text[:200]}")
# Usage
result = analyse_document(open("contract.txt").read())
print(result)
Tools transform a Q&A agent into an agent that can take actions. Claude decides when to call a tool based on the conversation; your code executes it. The execution loop is: send messages → Claude returns a tool call → you run the function → send the result back → Claude continues.
import anthropic
client = anthropic.Anthropic()
# Define tools — Claude uses these descriptions to decide when to call them
tools = [
{
"name": "search_knowledge_base",
"description": "Search the company knowledge base for relevant documents. Use this when the user asks a question that may be answered by internal documentation.",
"input_schema": {
"type": "object",
"properties": {
"query": {
"type": "string",
"description": "The search query"
},
"max_results": {
"type": "integer",
"description": "Maximum number of results to return (1-5)",
"default": 3
}
},
"required": ["query"]
}
},
{
"name": "create_support_ticket",
"description": "Create a support ticket in the ticketing system. Use this ONLY when the user explicitly requests a ticket and has confirmed the issue description.",
"input_schema": {
"type": "object",
"properties": {
"title": {"type": "string"},
"description": {"type": "string"},
"priority": {
"type": "string",
"enum": ["low", "medium", "high"]
}
},
"required": ["title", "description", "priority"]
}
}
]
def execute_tool(tool_name: str, tool_input: dict) -> str:
"""Execute a tool call and return the result as a string."""
if tool_name == "search_knowledge_base":
results = search_kb(tool_input["query"], tool_input.get("max_results", 3))
return json.dumps(results)
elif tool_name == "create_support_ticket":
# PSF D6: HIGH-STAKES ACTION — log and require confirmation
log_pending_action("create_ticket", tool_input)
ticket_id = create_ticket(tool_input)
return json.dumps({"ticket_id": ticket_id, "status": "created"})
else:
return json.dumps({"error": f"Unknown tool: {tool_name}"})
def run_agent(user_message: str, conversation_history: list) -> str:
"""Run one turn of the agentic loop."""
messages = conversation_history + [{"role": "user", "content": user_message}]
while True:
response = client.messages.create(
model="claude-sonnet-4-5-20251022",
max_tokens=2048,
system=SYSTEM_PROMPT,
tools=tools,
messages=messages
)
# If Claude has finished (no tool call), return the final text
if response.stop_reason == "end_turn":
return response.content[0].text
# Claude wants to use a tool
if response.stop_reason == "tool_use":
# Add Claude's response to history
messages.append({"role": "assistant", "content": response.content})
# Execute each tool call
tool_results = []
for block in response.content:
if block.type == "tool_use":
result = execute_tool(block.name, block.input)
tool_results.append({
"type": "tool_result",
"tool_use_id": block.id,
"content": result
})
# Add results to history and continue the loop
messages.append({"role": "user", "content": tool_results})
continue
# Unexpected stop reason
raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
Input governance is your first line of defence. For internal-use agents with authenticated users, a basic content policy check is sufficient. For customer-facing agents, you need semantic classification before the message reaches Claude.
import re
from typing import Optional
# Simple rule-based input filter (add before the agent call)
BLOCKED_PATTERNS = [
r"ignore (all |your |previous |above )?instructions",
r"disregard (the |your |all |previous )?system prompt",
r"you are now",
r"forget (everything|your instructions|your role)",
r"DAN mode|jailbreak|unrestricted mode",
]
def check_input(user_message: str) -> Optional[str]:
"""
Returns None if input is acceptable, or a rejection reason if not.
Extend this with a classification LLM call for production customer-facing deployments.
"""
msg_lower = user_message.lower()
for pattern in BLOCKED_PATTERNS:
if re.search(pattern, msg_lower):
return "prompt_injection_attempt"
if len(user_message) > 10000:
return "input_too_long"
return None # Input is acceptable
# In your agent entry point:
def handle_user_message(user_message: str, session_id: str) -> str:
rejection = check_input(user_message)
if rejection:
log_rejection(session_id, user_message, rejection)
return "I can't help with that request. If you believe this is an error, please contact support."
return run_agent(user_message, get_session_history(session_id))
# pip install presidio-analyzer presidio-anonymizer
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine
analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()
def scrub_pii(text: str) -> str:
"""Remove PII before sending to Anthropic API."""
results = analyzer.analyze(text=text, language="en")
if results:
anonymized = anonymizer.anonymize(text=text, analyzer_results=results)
return anonymized.text
return text
# Use before API call in regulated environments:
safe_message = scrub_pii(user_message)
response = run_agent(safe_message, history)
For most production deployments, a simple FastAPI or Express server wrapping the agent logic is the right pattern. The agent itself is stateless — all state (session history) lives in your database or cache.
# FastAPI deployment — production-ready agent endpoint
from fastapi import FastAPI, HTTPException, Header
from pydantic import BaseModel
import anthropic, redis, json, logging
app = FastAPI()
client = anthropic.Anthropic()
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
logger = logging.getLogger(__name__)
class MessageRequest(BaseModel):
message: str
session_id: str
@app.post("/agent/message")
async def send_message(
request: MessageRequest,
authorization: str = Header(...) # Require auth
):
# 1. Authenticate caller
user = authenticate(authorization)
if not user:
raise HTTPException(status_code=401)
# 2. Input governance
rejection = check_input(request.message)
if rejection:
logger.warning(f"Rejected input: session={request.session_id} reason={rejection}")
raise HTTPException(status_code=400, detail="Input rejected by content policy")
# 3. Load session history
history_raw = cache.get(f"session:{request.session_id}")
history = json.loads(history_raw) if history_raw else []
# 4. Run agent
try:
response = run_agent(request.message, history)
except Exception as e:
logger.error(f"Agent error: {e}", exc_info=True)
raise HTTPException(status_code=500, detail="Agent error")
# 5. Persist updated history (expire after 1 hour)
history.append({"role": "user", "content": request.message})
history.append({"role": "assistant", "content": response})
cache.setex(f"session:{request.session_id}", 3600, json.dumps(history[-20:])) # keep last 20 turns
return {"response": response, "session_id": request.session_id}
You need to know: is the agent working? Is it costing what you expect? Are there error patterns you need to address? Basic structured logging gives you this.
import time, uuid, structlog
log = structlog.get_logger()
def run_agent_with_logging(user_message: str, history: list, session_id: str) -> str:
trace_id = str(uuid.uuid4())
start = time.time()
try:
response = run_agent(user_message, history)
duration_ms = int((time.time() - start) * 1000)
log.info(
"agent_call_success",
trace_id=trace_id,
session_id=session_id,
model="claude-sonnet-4-5-20251022",
input_chars=len(user_message),
output_chars=len(response),
duration_ms=duration_ms,
)
return response
except Exception as e:
duration_ms = int((time.time() - start) * 1000)
log.error(
"agent_call_error",
trace_id=trace_id,
session_id=session_id,
error=str(e),
duration_ms=duration_ms,
)
raise
Ship these logs to your observability platform (Datadog, Grafana, CloudWatch) and create alerts on: error rate above 5%, P95 latency above 10 seconds, and daily token spend above your budget threshold. Monitor the Anthropic Console for API-level metrics as a second signal.
Run these four test categories against every agent before go-live. They surface the failure modes that cause production incidents.
Complete this checklist before declaring your Claude agent production-ready. Every unchecked item is a known production risk.
The AIDA examination tests applied PSF knowledge across all eight domains — exactly the gaps and strengths covered in this assessment. 15 minutes. No charge. Ever.