Deployment Guide · Anthropic Claude · PSF-aligned · May 2026 · 40 min read

Deploying Claude Agents Safely
The Complete Production Guide

This guide takes you from a blank project to a Claude agent running safely in production — with tool use, input governance, human oversight, monitoring, and PSF compliance built in. A practitioner with basic Python knowledge can follow every step and end up with an agent they can confidently hand to a client or deploy internally.

Who this is for: Developers, IT practitioners, consultants, and MSPs building Claude-powered agents for internal use or client deployments. Assumes basic Python familiarity. No ML background required.
Note: Anthropic model names and API features change regularly. Verify current model availability and pricing at docs.anthropic.com before production deployment. CC BY 4.0 — share freely with attribution.

1. What you are actually deploying

A Claude agent is a Claude model instance running with a persistent system prompt that defines its role, scope, and constraints — plus, optionally, a set of tools (functions) the model can call to interact with external systems. That is the whole architecture. The complexity comes from configuring it safely and operating it responsibly.

Before writing a line of code, understand what you are introducing into the environment:

A reasoning engine: Claude reads the context it is given, reasons about it, and produces text or tool calls. It does not have memory between API calls unless you explicitly pass prior messages.
A system prompt: This is your primary control surface. Everything Claude does in production is shaped by what you put here. Treat it as code — version it, review it, test it.
Tool definitions: Functions you declare that Claude can invoke. It constructs the call; your code executes it. Claude cannot run code directly — it sends instructions; you decide whether to act on them.
The conversation history: Every message in the context window shapes the response. Managing context — what to include, when to summarise, when to truncate — is a core production concern.
⚠ Watch out: Claude is a tool-using system. When you give Claude a tool that sends emails or updates a database, you are giving an AI system the ability to take real-world actions. Every tool you add increases the blast radius of a prompt injection or misconfiguration. Add tools deliberately, not by default.
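Statelessness is visible in code: the only memory an agent has is the messages list your application rebuilds and resends on each call. A minimal sketch (no API call made; the helper name is illustrative):

```python
# Conversation "memory" is just the messages list your application rebuilds
# and resends on every API call; drop a turn and the model never saw it.

def append_turn(history: list, role: str, text: str) -> list:
    """Return a new history list with one turn appended."""
    return history + [{"role": role, "content": text}]

history = []
history = append_turn(history, "user", "What is our refund policy?")
history = append_turn(history, "assistant", "Refunds are available within 30 days.")
# The follow-up only makes sense because the prior turns are resent:
history = append_turn(history, "user", "Does that apply to digital goods?")
# `history` is what you would pass as messages= on the next API call.
```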

2. Deployment patterns and when to use them

Pattern · best for · complexity

Direct API call
Best for: Single-turn tasks: classification, summarisation, data extraction, drafting. No memory, no tools, stateless.
Complexity: Low — a few lines of Python

Conversational agent
Best for: Multi-turn dialogue with a user. Context passed in the messages array. Optional tools. Stateless between sessions unless you persist the history.
Complexity: Medium — session management required

Tool-using agent (agentic loop)
Best for: Tasks requiring external actions: search, database queries, email, calendar. The agent calls tools until the task is complete.
Complexity: Medium-high — tool execution, error handling, human oversight gates

Multi-agent pipeline
Best for: Complex workflows with routing, specialised sub-agents, parallel execution. One Claude instance orchestrates others.
Complexity: High — inter-agent communication, shared context, combined blast radius

Claude via n8n or another workflow tool
Best for: Non-developer teams, MSP automation, integration-heavy workflows. Visual builder, no custom hosting required.
Complexity: Low-medium — governed by the host platform's PSF profile

This guide focuses on the tool-using conversational agent — the most common production deployment and the one with the most safety considerations. The principles apply to all patterns.

3. Phase 1 — Account and API setup

⏱ Estimated time: 30 minutes
1
Create an Anthropic account
Go to console.anthropic.com. Use a company email address, not personal — the account will hold API keys for production workloads. Enable two-factor authentication immediately.
2
Generate an API key
In the Console, go to API Keys → Create Key. Name it descriptively: prod-support-agent or staging-document-processor. Copy it immediately — it is only shown once.
3
Store the key in a secrets manager
Never paste an API key into code or a .env file that gets committed to git. Use your platform's secrets service: AWS Secrets Manager, Azure Key Vault, GitHub Actions secrets, or Doppler. For local development only, use a .env file that is gitignored — and verify it is in your .gitignore before your first commit.
# Install the Anthropic SDK
pip install anthropic

# Set key in environment (dev only — use secrets manager in production)
export ANTHROPIC_API_KEY="sk-ant-..."

# Verify setup
python -c "import anthropic; print(anthropic.Anthropic().models.list())"
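In application code, resolve the key through one helper rather than reading the environment at call sites. A sketch that falls back from a secrets manager to the environment; the boto3 call is commented out, and the secret name prod/anthropic-api-key is an assumption, not a real resource:

```python
import os

def get_anthropic_key() -> str:
    """Resolve the API key: secrets manager in production, env var in dev."""
    # In production, prefer your secrets service, e.g. (boto3, assumed config):
    # import boto3
    # sm = boto3.client("secretsmanager")
    # return sm.get_secret_value(SecretId="prod/anthropic-api-key")["SecretString"]
    key = os.environ.get("ANTHROPIC_API_KEY")
    if not key:
        raise RuntimeError("ANTHROPIC_API_KEY not set and no secrets manager configured")
    return key
```

The Anthropic SDK reads ANTHROPIC_API_KEY automatically; a helper like this is for cases where you pass the key explicitly or want one auditable resolution path.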
4
Set spend limits
In Console → Billing, set a monthly spend limit slightly above your expected usage. This is your circuit breaker against runaway costs from bugs or abuse. Start conservatively — you can raise it once you understand real usage.
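The limit is easier to set with a rough usage estimate. A back-of-envelope sketch; the per-token prices are illustrative placeholders, not current Anthropic pricing (check docs.anthropic.com):

```python
# Back-of-envelope monthly cost estimate. Prices below are ILLUSTRATIVE
# PLACEHOLDERS: verify current per-model pricing at docs.anthropic.com.

def estimate_monthly_cost(
    requests_per_day: int,
    avg_input_tokens: int,
    avg_output_tokens: int,
    input_price_per_mtok: float = 3.0,    # assumed $/million input tokens
    output_price_per_mtok: float = 15.0,  # assumed $/million output tokens
) -> float:
    daily = (
        requests_per_day * avg_input_tokens * input_price_per_mtok
        + requests_per_day * avg_output_tokens * output_price_per_mtok
    ) / 1_000_000
    return daily * 30

# 1,000 requests/day at ~2k input and ~500 output tokens each:
print(round(estimate_monthly_cost(1000, 2000, 500), 2))  # → 405.0
```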
5
Choose your model
For most production agents, start with claude-sonnet-4-5. It balances capability, speed, and cost. Use claude-opus-4 only for tasks genuinely requiring maximum reasoning capability — it costs ~5× more per token. Use claude-haiku-4-5 for high-volume, latency-sensitive tasks. Pin the model version — never use 'latest' in production.
💡 Note: Anthropic releases new model versions regularly. Pin your model to a specific snapshot (e.g. claude-sonnet-4-5-20251022) and upgrade deliberately after testing, rather than inheriting changes automatically.
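One way to keep pins auditable is a single registry that every call site reads. A minimal sketch; the snapshot IDs are taken from the examples above and must be verified against the Console before use:

```python
# Central model registry: change pins here and nowhere else.
# IDs are illustrative; verify current snapshots in the Anthropic Console.
MODEL_PINS = {
    "default": "claude-sonnet-4-5-20251022",
    "high_reasoning": "claude-opus-4",   # pin a dated snapshot before production
    "high_volume": "claude-haiku-4-5",   # pin a dated snapshot before production
}

def model_for(tier: str = "default") -> str:
    """Look up the pinned model ID for a task tier."""
    return MODEL_PINS[tier]
```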

4. Phase 2 — Define your agent

⏱ Estimated time: 2–4 hours (most of this is writing a good system prompt)

The system prompt is the most important thing you will write. It defines what your agent is, what it can do, what it cannot do, and how it should behave in edge cases. Spend more time on it than you think you need to.

4.1 System prompt structure

A production-grade system prompt has four components:

SYSTEM_PROMPT = """
You are a document analysis assistant for Meridian Legal Services.

## Role and scope
You analyse legal documents submitted by Meridian staff and extract:
- Key parties (names, roles, and responsibilities)
- Critical dates and deadlines
- Obligations on each party
- Identified risk clauses

You do NOT provide legal advice, legal opinions, or recommendations
on whether to sign or reject documents. You surface information;
lawyers make decisions.

## What you will and will not do
WILL DO:
- Extract structured information from documents
- Flag clauses that match known risk patterns (list below)
- Ask clarifying questions if document type is unclear
- Summarise documents in plain English for non-lawyers

WILL NOT DO:
- Give legal advice or opinions
- Summarise documents outside the legal/contractual category
- Retain or reference information from previous sessions
- Access external systems or search the internet

## Risk clause patterns to flag
- Indemnity clauses broader than direct damages
- Liability caps below £100,000
- Governing law clauses in non-UK/EU jurisdictions
- Automatic renewal clauses with notice periods under 60 days
- IP assignment clauses not limited to work product

## Output format
Always respond in JSON with this structure:
{
  "parties": [...],
  "key_dates": [...],
  "obligations": {...},
  "risk_flags": [...],
  "summary": "...",
  "uncertainty_notes": [...]
}

## Uncertainty handling
If you cannot extract a field with confidence, set it to null
and add a note in the "uncertainty_notes" field explaining why.
Do not guess. Do not hallucinate party names or dates.
"""

4.2 Your first API call

import anthropic
import json

client = anthropic.Anthropic()

def analyse_document(document_text: str) -> dict:
    message = client.messages.create(
        model="claude-sonnet-4-5-20251022",  # pinned version
        max_tokens=2048,
        system=SYSTEM_PROMPT,
        messages=[
            {
                "role": "user",
                "content": f"Please analyse this document:\n\n{document_text}"
            }
        ]
    )
    
    # Extract text from response
    response_text = message.content[0].text
    
    # Parse structured output
    try:
        return json.loads(response_text)
    except json.JSONDecodeError:
        # Log this — Claude didn't follow the format instruction
        raise ValueError(f"Model returned non-JSON response: {response_text[:200]}")

# Usage
with open("contract.txt") as f:
    result = analyse_document(f.read())
print(result)
⚠ Watch out: Always handle JSON parse failures explicitly. Even with strong output format instructions, Claude will occasionally produce narrative text (e.g. when it has a concern about the request). A production agent that crashes on a parse error causes incidents.
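One hedge against parse failures is to tolerate minor format drift before giving up: strip a markdown code fence if present, then fall back to extracting the outermost JSON object. A sketch:

```python
import json
import re

def parse_model_json(response_text: str) -> dict:
    """Best-effort extraction of a JSON object from model output."""
    text = response_text.strip()
    # Strip a markdown code fence if present (```json ... ```)
    fence = re.match(r"```(?:json)?\s*(.*?)\s*```", text, re.DOTALL)
    if fence:
        text = fence.group(1)
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Last resort: grab the outermost-looking {...} span
        start, end = text.find("{"), text.rfind("}")
        if start != -1 and end > start:
            return json.loads(text[start:end + 1])
        raise
```

Log every time the fallback path fires: frequent drift is a signal your output-format instructions need strengthening.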

5. Phase 3 — Add tool use

⏱ Estimated time: 4–8 hours per tool

Tools transform a Q&A agent into an agent that can take actions. Claude decides when to call a tool based on the conversation; your code executes it. The execution loop is: send messages → Claude returns a tool call → you run the function → send the result back → Claude continues.

import anthropic
import json

client = anthropic.Anthropic()

# Define tools — Claude uses these descriptions to decide when to call them
tools = [
    {
        "name": "search_knowledge_base",
        "description": "Search the company knowledge base for relevant documents. Use this when the user asks a question that may be answered by internal documentation.",
        "input_schema": {
            "type": "object",
            "properties": {
                "query": {
                    "type": "string",
                    "description": "The search query"
                },
                "max_results": {
                    "type": "integer",
                    "description": "Maximum number of results to return (1-5)",
                    "default": 3
                }
            },
            "required": ["query"]
        }
    },
    {
        "name": "create_support_ticket",
        "description": "Create a support ticket in the ticketing system. Use this ONLY when the user explicitly requests a ticket and has confirmed the issue description.",
        "input_schema": {
            "type": "object",
            "properties": {
                "title": {"type": "string"},
                "description": {"type": "string"},
                "priority": {
                    "type": "string",
                    "enum": ["low", "medium", "high"]
                }
            },
            "required": ["title", "description", "priority"]
        }
    }
]

def execute_tool(tool_name: str, tool_input: dict) -> str:
    """Execute a tool call and return the result as a string."""
    if tool_name == "search_knowledge_base":
        results = search_kb(tool_input["query"], tool_input.get("max_results", 3))
        return json.dumps(results)
    elif tool_name == "create_support_ticket":
        # PSF D6: HIGH-STAKES ACTION — log and require confirmation
        log_pending_action("create_ticket", tool_input)
        ticket_id = create_ticket(tool_input)
        return json.dumps({"ticket_id": ticket_id, "status": "created"})
    else:
        return json.dumps({"error": f"Unknown tool: {tool_name}"})

def run_agent(user_message: str, conversation_history: list) -> str:
    """Run one turn of the agentic loop."""
    messages = conversation_history + [{"role": "user", "content": user_message}]
    
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-5-20251022",
            max_tokens=2048,
            system=SYSTEM_PROMPT,
            tools=tools,
            messages=messages
        )
        
        # If Claude has finished (no tool call), return the final text
        if response.stop_reason == "end_turn":
            return response.content[0].text
        
        # Claude wants to use a tool
        if response.stop_reason == "tool_use":
            # Add Claude's response to history
            messages.append({"role": "assistant", "content": response.content})
            
            # Execute each tool call
            tool_results = []
            for block in response.content:
                if block.type == "tool_use":
                    result = execute_tool(block.name, block.input)
                    tool_results.append({
                        "type": "tool_result",
                        "tool_use_id": block.id,
                        "content": result
                    })
            
            # Add results to history and continue the loop
            messages.append({"role": "user", "content": tool_results})
            continue
        
        # Unexpected stop reason
        raise RuntimeError(f"Unexpected stop_reason: {response.stop_reason}")
⚠ Watch out: The tool loop will run indefinitely if Claude keeps calling tools. Add a maximum iteration counter (e.g. max_iterations=10) and raise an error if it is hit. Infinite loops are a real production failure mode in agentic systems.
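The cap can be factored into a small wrapper so every agent shares it. The sketch below is structural only: step stands in for one round of the API call plus tool execution, and fake_step is a stub for demonstration:

```python
def run_bounded(step, max_iterations: int = 10):
    """Run an agentic step function until it reports completion.

    `step` returns (done, value): done=True with the final text, or
    done=False after executing one round of tool calls.
    """
    for _ in range(max_iterations):
        done, value = step()
        if done:
            return value
    raise RuntimeError(f"Tool loop exceeded {max_iterations} iterations, aborting")

# Stub demonstrating the shape: finishes on the third round.
calls = {"n": 0}
def fake_step():
    calls["n"] += 1
    return (calls["n"] >= 3, "final answer" if calls["n"] >= 3 else None)

print(run_bounded(fake_step))  # → final answer
```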

6. Phase 4 — Input and output governance

⏱ Estimated time: 4–8 hours

Input governance is your first line of defence. For internal-use agents with authenticated users, a basic content policy check is sufficient. For customer-facing agents, you need semantic classification before the message reaches Claude.

6.1 Input governance layer

import re
from typing import Optional

# Simple rule-based input filter (add before the agent call)
BLOCKED_PATTERNS = [
    r"ignore (all |your |previous |above )?instructions",
    r"disregard (the |your |all |previous )?system prompt",
    r"you are now",
    r"forget (everything|your instructions|your role)",
    r"dan mode|jailbreak|unrestricted mode",  # matched against lowercased input
]

def check_input(user_message: str) -> Optional[str]:
    """
    Returns None if input is acceptable, or a rejection reason if not.
    Extend this with a classification LLM call for production customer-facing deployments.
    """
    msg_lower = user_message.lower()
    
    for pattern in BLOCKED_PATTERNS:
        if re.search(pattern, msg_lower):
            return "prompt_injection_attempt"
    
    if len(user_message) > 10000:
        return "input_too_long"
    
    return None  # Input is acceptable

# In your agent entry point:
def handle_user_message(user_message: str, session_id: str) -> str:
    rejection = check_input(user_message)
    if rejection:
        log_rejection(session_id, user_message, rejection)
        return "I can't help with that request. If you believe this is an error, please contact support."
    
    return run_agent(user_message, get_session_history(session_id))

6.2 PII scrubbing before API call

# pip install presidio-analyzer presidio-anonymizer
from presidio_analyzer import AnalyzerEngine
from presidio_anonymizer import AnonymizerEngine

analyzer = AnalyzerEngine()
anonymizer = AnonymizerEngine()

def scrub_pii(text: str) -> str:
    """Remove PII before sending to Anthropic API."""
    results = analyzer.analyze(text=text, language="en")
    if results:
        anonymized = anonymizer.anonymize(text=text, analyzer_results=results)
        return anonymized.text
    return text

# Use before API call in regulated environments:
safe_message = scrub_pii(user_message)
response = run_agent(safe_message, history)
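Output deserves the same scrutiny as input. Before returning a response, check it against your contract: required structure present, no system prompt content leaked. A minimal sketch, assuming the JSON contract from section 4.1; the leak markers are illustrative and should be derived from your actual prompt:

```python
import json

# Fields the agent contract requires (matches the system prompt in 4.1)
REQUIRED_FIELDS = {"parties", "key_dates", "obligations", "risk_flags", "summary"}

# Distinctive system-prompt phrases that should never appear verbatim in output.
# These markers are examples; derive yours from your real prompt.
LEAK_MARKERS = ["## Role and scope", "WILL NOT DO:"]

def check_output(raw_text: str) -> list:
    """Return a list of governance problems; an empty list means OK to return."""
    problems = []
    for marker in LEAK_MARKERS:
        if marker in raw_text:
            problems.append(f"possible system prompt leak: {marker!r}")
    try:
        parsed = json.loads(raw_text)
    except json.JSONDecodeError:
        problems.append("response is not valid JSON")
        return problems
    missing = REQUIRED_FIELDS - set(parsed)
    if missing:
        problems.append(f"missing required fields: {sorted(missing)}")
    return problems
```

Route any non-empty result to logging and a safe fallback message rather than the raw model output.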

7. Phase 5 — Hosting and deployment

⏱ Estimated time: 4–16 hours depending on your hosting platform

For most production deployments, a simple FastAPI or Express server wrapping the agent logic is the right pattern. The agent itself is stateless — all state (session history) lives in your database or cache.

# FastAPI deployment — production-ready agent endpoint
from fastapi import FastAPI, HTTPException, Header
from pydantic import BaseModel
import anthropic, redis, json, logging

app = FastAPI()
client = anthropic.Anthropic()
cache = redis.Redis(host="localhost", port=6379, decode_responses=True)
logger = logging.getLogger(__name__)

class MessageRequest(BaseModel):
    message: str
    session_id: str

@app.post("/agent/message")
async def send_message(
    request: MessageRequest,
    authorization: str = Header(...)  # Require auth
):
    # 1. Authenticate caller (authenticate() is a placeholder for your auth layer)
    user = authenticate(authorization)
    if not user:
        raise HTTPException(status_code=401)
    
    # 2. Input governance
    rejection = check_input(request.message)
    if rejection:
        logger.warning(f"Rejected input: session={request.session_id} reason={rejection}")
        raise HTTPException(status_code=400, detail="Input rejected by content policy")
    
    # 3. Load session history
    history_raw = cache.get(f"session:{request.session_id}")
    history = json.loads(history_raw) if history_raw else []
    
    # 4. Run agent
    try:
        response = run_agent(request.message, history)
    except Exception as e:
        logger.error(f"Agent error: {e}", exc_info=True)
        raise HTTPException(status_code=500, detail="Agent error")
    
    # 5. Persist updated history (expire after 1 hour)
    history.append({"role": "user", "content": request.message})
    history.append({"role": "assistant", "content": response})
    cache.setex(f"session:{request.session_id}", 3600, json.dumps(history[-20:]))  # keep last 20 turns
    
    return {"response": response, "session_id": request.session_id}
💡 Note: Keep session history bounded — truncate to the last N turns or summarise older context. Unlimited context growth increases cost, increases latency, and can cause context window overflows in long sessions.
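A slightly more careful trim than history[-20:] records that truncation happened. A sketch; in production the note could be replaced by a model-generated summary of the dropped turns:

```python
def trim_history(history: list, max_turns: int = 20) -> list:
    """Bound history length, keeping a note about what was dropped.

    The note is a crude placeholder; a production system might summarise
    the dropped turns with a low-cost model call instead. If your provider
    requires strict user/assistant alternation, merge the note into the
    first kept user turn rather than prepending a separate message.
    """
    if len(history) <= max_turns:
        return history
    dropped = len(history) - max_turns
    note = {
        "role": "user",
        "content": f"[Context note: {dropped} earlier turns were truncated.]",
    }
    return [note] + history[-max_turns:]
```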

8. Phase 6 — Monitoring

⏱ Estimated time: 4–8 hours to set up properly

You need to know: is the agent working? Is it costing what you expect? Are there error patterns you need to address? Basic structured logging gives you this.

import time, uuid, structlog

log = structlog.get_logger()

def run_agent_with_logging(user_message: str, history: list, session_id: str) -> str:
    trace_id = str(uuid.uuid4())
    start = time.time()
    
    try:
        response = run_agent(user_message, history)
        duration_ms = int((time.time() - start) * 1000)
        
        log.info(
            "agent_call_success",
            trace_id=trace_id,
            session_id=session_id,
            model="claude-sonnet-4-5-20251022",
            input_chars=len(user_message),
            output_chars=len(response),
            duration_ms=duration_ms,
        )
        return response
        
    except Exception as e:
        duration_ms = int((time.time() - start) * 1000)
        log.error(
            "agent_call_error",
            trace_id=trace_id,
            session_id=session_id,
            error=str(e),
            duration_ms=duration_ms,
        )
        raise

Ship these logs to your observability platform (Datadog, Grafana, CloudWatch) and create alerts on: error rate above 5%, P95 latency above 10 seconds, and daily token spend above your budget threshold. Monitor the Anthropic Console for API-level metrics as a second signal.
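Character counts are only a proxy: the Python SDK reports true token usage on every response via response.usage.input_tokens and response.usage.output_tokens. A sketch of a rolling daily counter to feed the spend alert; the in-process dict is illustrative, and in production this would live in Redis or your metrics system:

```python
from collections import defaultdict
from datetime import date

# Rolling daily token counters. An in-process dict is for illustration only;
# back this with Redis or your metrics pipeline in production.
daily_tokens = defaultdict(int)

def record_usage(usage) -> int:
    """Record one response's token usage; return today's running total."""
    key = date.today().isoformat()
    daily_tokens[key] += usage.input_tokens + usage.output_tokens
    return daily_tokens[key]

class FakeUsage:  # stand-in for response.usage in this sketch
    input_tokens = 1200
    output_tokens = 300

print(record_usage(FakeUsage()))  # → 1500
```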

9. Phase 7 — Testing before go-live

⏱ Estimated time: 4–8 hours

Run these four test categories against every agent before go-live. They surface the failure modes that cause production incidents.

1. Golden set — expected behaviour
20–50 representative inputs with expected outputs. Automate these and run them on every prompt or model change. If any golden test regresses, the change does not go to production.
2. Boundary testing — edge cases
Empty input, extremely long input (10,000+ chars), non-English input, input that is technically valid but semantically odd. The agent should handle these gracefully, not crash or hallucinate.
3. Adversarial inputs — prompt injection
Explicit jailbreak attempts, prompt injection via document content, instruction overrides embedded in user messages. Your input governance layer should catch these. Verify it does.
4. Tool execution — the dangerous one
For every tool your agent has, verify: (a) it calls the tool with valid arguments, (b) it handles tool errors gracefully, (c) it does not call the tool when it should not. Use a test double that logs calls without executing real side effects.

10. PSF alignment checklist

Complete this checklist before declaring your Claude agent production-ready. Every unchecked item is a known production risk.

D1: Input Governance
D2: Output Validation
D3: Data Protection
D4: Observability
D5: Deployment Safety
D6: Human Oversight
D7: Security
D8: Vendor Resilience