Production AI Institute · PSF v1.1 open standard

Build a Managed AI Agent Practice with Sovereign Infrastructure

Most MSPs selling AI services are reselling SaaS subscriptions such as Copilot seats, ChatGPT Enterprise licences, or third-party AI tools. A sovereign agent deployment practice is different: you deploy and operate AI agent infrastructure inside the client's own environment, with certified practitioners, documented governance, and a managed operating model. This playbook explains how to build it.

Why sovereign infrastructure is the MSP opportunity

Some clients in financial services, healthcare, legal, government, or other high-sensitivity settings need tighter control over where data is processed, how access is governed, and what evidence is available later. In those cases, sovereign self-hosted infrastructure may be the right answer because the runtime and operational data remain inside a client-controlled perimeter.

The reference stack

  • Agent runtime: Trinity (Ability.AI). Self-hosted, open-source, security-audited.
  • Infrastructure: Docker / Kubernetes. Isolation, reproducibility, resource limits.
  • Agent state: Client GitHub. Version-controlled, client-owned.
  • Task queue: Redis Streams. Durable agent task queueing.
  • LLM provider: OpenAI / Anthropic / Ollama. Client-provisioned API credentials.
  • Observability: OpenTelemetry + client stack. Traces, costs, alerting.
  • Human oversight: Trinity approval queue. Built-in, configurable per action.
  • Audit log: Trinity (hash-chained). Tamper-evident, exportable.

This is a reference stack, not a mandate. Clients with existing Kubernetes infrastructure, GitLab instead of GitHub, or on-premise LLMs (via Ollama or vLLM) can substitute accordingly. The sovereignty principle holds as long as the runtime and data stay within the client's perimeter.
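The "hash-chained, tamper-evident" property of the audit log can be illustrated with a minimal sketch of the general technique. This is not Trinity's implementation: the record layout, the `GENESIS` placeholder, and the function names `append_entry` and `verify_chain` are all assumptions made for illustration.

```python
import hashlib
import json

# Assumed convention: the first entry links to a fixed placeholder hash.
GENESIS = "0" * 64


def _digest(prev_hash: str, payload: dict) -> str:
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log: list, payload: dict) -> dict:
    """Append a payload to the log, linking it to the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry = {
        "payload": payload,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, payload),
    }
    log.append(entry)
    return entry


def verify_chain(log: list) -> bool:
    """Recompute every hash; an edited or reordered entry breaks the chain."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["payload"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the one before it, changing any earlier record invalidates every later link. A chain like this does not by itself detect truncation of the most recent entries, so the latest hash is usually recorded somewhere outside the log as well.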

The deployment playbook

Six phases from certification to managed service. Each phase has defined deliverables, owners, and time estimates. The full cycle from certification to first production agent runs 6–8 weeks.

01
Get certified
Week 1–2

Before you deploy anything for a client, your team needs the credentials that prove you understand what you are doing. In the agent deployment practice, two certifications are essential and one is strongly recommended.

Owner: The person who will operate the deployed agent environment day-to-day.
CAOP covers the eight PSF domains as they apply to running production agent systems: input governance, output validation, data protection, observability, deployment safety, human oversight, security, and vendor resilience. It is the credential that clients in regulated sectors will ask for.
10–12 hours study + 90-min exam
Owner: The practitioner responsible for governance, risk framing, and operating controls.
CAIG covers governance, accountability, risk tiering, and audit-ready operating practice for production AI. It is useful when the client conversation extends beyond implementation into who owns the system, what controls apply, and how the deployment will be governed over time.
8–10 hours study + 75-min exam
Owner: The team member who produces governance reports for regulated clients.
Some clients need periodic assurance work once an agent is in production. CAIA is the relevant pathway for practitioners conducting evidence-based reviews, documenting findings, and assessing deployments against the PSF.
8–10 hours study + 75-min exam
02
Understand the sovereign architecture
Week 2

The MSP agent deployment model is fundamentally different from reselling SaaS AI tools. You are deploying infrastructure inside the client's environment (or a dedicated environment you manage on their behalf). Understanding this distinction is essential for the client conversation, the scope-of-work, and the ongoing managed service.

Owner: Everyone in the practice.
In a sovereign agent deployment, the runtime (Trinity) runs inside Docker on the client's infrastructure: their on-premise servers, their private cloud, or a dedicated managed environment you provision for them. Agent state is stored in the client's GitHub repository, and no operational data is transmitted to Ability.AI. The one outbound flow to plan for is the LLM call. With a hosted provider (OpenAI or Anthropic), prompts and responses go to that provider under the client's own API credentials; with a self-hosted model via Ollama, inference also stays inside the perimeter. This is the answer to 'where does our data go?', and for many regulated clients it is the answer they need before the engagement can proceed.
Read Trinity PSF assessment + architecture docs
Infrastructure prerequisites
Technical
Owner: Lead engineer.
Trinity requires: Docker (or Kubernetes for larger deployments), a GitHub account or self-hosted Gitea for agent state, Redis (for task queuing via Redis Streams), and the LLM API credentials the client will use (OpenAI, Anthropic, or a self-hosted model via Ollama). For most MSP client environments, this means confirming Docker is available and provisioning a small Redis instance — the additional infrastructure footprint is modest.
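The environment assessment can be partly scripted. The sketch below is a hypothetical preflight check, not a Trinity tool. It assumes Redis listens on its default port 6379 and that hosted-provider credentials are supplied through the environment variables `OPENAI_API_KEY` or `ANTHROPIC_API_KEY`; a self-hosted Ollama model would not need either, so adjust `credential_vars` to match the client's setup.

```python
import shutil
import socket


def docker_available() -> bool:
    """True if a `docker` executable is found on PATH."""
    return shutil.which("docker") is not None


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def preflight(
    env: dict,
    redis_host: str = "localhost",
    redis_port: int = 6379,
    credential_vars=("OPENAI_API_KEY", "ANTHROPIC_API_KEY"),  # assumed names
    docker_check=docker_available,
    tcp_check=tcp_reachable,
) -> dict:
    """Return a prerequisite-name -> bool report for the target host."""
    return {
        "docker": docker_check(),
        "redis": tcp_check(redis_host, redis_port),
        "llm_credentials": any(env.get(name) for name in credential_vars),
    }
```

On a real host this would be called as `preflight(dict(os.environ))`. The two check functions are injectable so the report logic can be exercised without Docker or Redis present.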
2–4 hours environment assessment
03
Define the first agent use case
Week 2–3

The most common mistake in an agent deployment engagement is trying to automate everything at once. Start with one well-defined use case where the agent's scope is narrow, the data is relatively non-sensitive, and the value is demonstrable within 30 days. This builds client confidence and gives you a reference deployment.

Owner: Account manager + CAOP-certified operator.
Use the PAI discovery framework to identify workflows in the client's environment that are high-repetition, rules-based, and currently performed by people. Focus on tasks where the output can be reviewed before the agent acts: you can then deploy with an approval queue and give the client visibility into every agent action before it executes. Good first agents: support ticket triage, invoice routing, document classification, internal query answering.
2–3 hour structured session
Use case scoping document
Deliverable
Owner: CAOP operator + account manager.
Before configuring anything, produce a one-page use case scope document covering: the task the agent will perform, the data it will access (and what it will not access), the channels it will be reachable through, the approval rules (what requires human sign-off before the agent acts), the success metric (how you will know it is working), and the exit criteria (what causes the agent to be shut down or referred to a human). This document becomes the configuration specification and the governance reference.
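Since the scope document doubles as the configuration specification, it can help to capture it as structured data and check that no section has been left blank. The sketch below assumes a hypothetical `UseCaseScope` record whose fields mirror the six items listed above; it is not a PAI or Trinity schema.

```python
from dataclasses import dataclass, fields


@dataclass
class UseCaseScope:
    """One-page scope document as data (hypothetical field names)."""
    task: str
    data_accessed: list
    data_excluded: list
    channels: list
    approval_rules: list
    success_metric: str
    exit_criteria: list


def missing_sections(scope: UseCaseScope) -> list:
    """Return the names of any sections that are still empty."""
    return [f.name for f in fields(scope) if not getattr(scope, f.name)]


draft = UseCaseScope(
    task="Triage inbound support tickets into priority bands",
    data_accessed=["ticket subject", "ticket body"],
    data_excluded=["payment details", "customer contact records"],
    channels=["#support-triage"],
    approval_rules=["human sign-off before any customer-facing reply"],
    success_metric="share of tickets routed correctly on first pass",
    exit_criteria=[],  # not yet agreed with the client
)
```

Here `missing_sections(draft)` returns `["exit_criteria"]`, a signal that the document is not ready to serve as the configuration specification.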
Half day
04
Deploy and configure Trinity
Week 3–4

Trinity is deployed as a Docker container into the client's environment. Configuration is done through Trinity's agent definition files, which live in the client's GitHub repository. Every change to an agent's configuration is version-controlled and auditable.

Environment provisioning
Technical
Owner: Delivery engineer.
Provision the Docker host (or Kubernetes namespace), Redis instance, and GitHub repository. Configure network access: Trinity needs outbound access to the LLM API endpoint and to any client systems the agent will integrate with. It does not need inbound access from the internet unless you are exposing a public webhook for external triggers. Establish separate credential scopes for the development instance and the production instance.
4–8 hours
Owner: CAOP operator.
Define the agent in Trinity's configuration: name, description, system prompt, permitted tools, channel bindings (which Slack channels or webhooks can trigger it), RBAC restrictions (which user roles can interact with it), and approval rules (which action types require human sign-off before execution). The guardrails are configured here — this is the PSF D1 and D5 work. Every tool the agent has access to should be the minimum required for the use case. Never grant an agent write access to a system it only needs read access to.
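The least-privilege rule can be checked mechanically before an agent definition is committed. The following is a minimal sketch under assumed names: the `ACCESS_LEVELS` ordering and the dictionaries of granted and required access are illustrative and do not reflect Trinity's configuration format.

```python
# Assumed ordering of access levels, lowest to highest.
ACCESS_LEVELS = {"none": 0, "read": 1, "write": 2}


def excess_grants(granted: dict, required: dict) -> dict:
    """Return systems where granted access exceeds what the use case needs."""
    excess = {}
    for system, level in granted.items():
        needed = required.get(system, "none")
        if ACCESS_LEVELS[level] > ACCESS_LEVELS[needed]:
            excess[system] = {"granted": level, "required": needed}
    return excess


# Example: the scope document says the agent only reads tickets and CRM data.
required = {"ticketing": "read", "crm": "read"}
granted = {"ticketing": "write", "crm": "read", "billing": "read"}
findings = excess_grants(granted, required)
```

In this example `findings` flags `ticketing` (write granted, read required) and `billing` (read granted, nothing required). An empty result means the tool list matches the scope document.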
4–8 hours per agent
Owner: CAOP operator.
Configure Trinity's OpenTelemetry exporter to send traces to your observability stack (or the client's). Set up dashboards for agent execution volume, error rate, approval queue depth, and LLM cost per agent. Configure alerting for anomalous execution patterns — an agent that executes 10× its expected daily volume should trigger a review before it consumes significant API spend. The tamper-evident audit log is on by default — confirm it is being written to durable storage.
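The volume alert described above reduces to a simple threshold comparison. The sketch below shows that rule in isolation; the function name and the idea of passing in a pre-agreed expected daily volume are assumptions, and in practice the rule would live in whichever alerting tool the observability stack provides.

```python
def needs_volume_review(
    executions_today: int,
    expected_daily: int,
    multiplier: float = 10.0,
) -> bool:
    """True when today's executions reach the multiple that triggers a review."""
    if expected_daily <= 0:
        raise ValueError("expected_daily must be positive")
    return executions_today >= multiplier * expected_daily
```

With an expected volume of 200 runs per day, 2,000 runs trips the review and 1,999 does not. The multiplier is left as a parameter because a lower threshold may suit agents with expensive models or tight budgets.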
2–4 hours
Owner: CAOP operator + client nominated reviewer.
Configure the approval queue for every action type that is irreversible or customer-facing. Assign the client's nominated reviewer to these approval tasks. Walk the reviewer through the approval interface: how they receive notification, what information is presented, how to approve or reject, and what happens to rejected actions. This is the human oversight layer — get the client's reviewer comfortable with it before going live.
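The approval rule in this step (anything irreversible or customer-facing goes to a human) can be written down as a small routing function. This is an illustrative sketch only: `ActionType`, `requires_approval`, and `route` are hypothetical names, and Trinity's own approval configuration may be expressed quite differently.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ActionType:
    """An action the agent can take, tagged with the two risk properties."""
    name: str
    irreversible: bool
    customer_facing: bool


def requires_approval(action: ActionType) -> bool:
    """Irreversible or customer-facing actions need human sign-off."""
    return action.irreversible or action.customer_facing


def route(action: ActionType, reviewer: str) -> dict:
    """Decide whether an action waits for the reviewer or runs directly."""
    if requires_approval(action):
        return {"action": action.name, "status": "pending_approval", "reviewer": reviewer}
    owner: Optional[str] = None  # no reviewer needed for low-risk actions
    return {"action": action.name, "status": "auto_execute", "reviewer": owner}


send_reply = ActionType("send_customer_reply", irreversible=True, customer_facing=True)
add_label = ActionType("add_internal_label", irreversible=False, customer_facing=False)
```

Routing `send_reply` yields a pending item assigned to the nominated reviewer, while `add_label` runs without sign-off. During the supervised pilot, the same function could simply return `pending_approval` for everything.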
2–3 hours including client walkthrough
05
Run a supervised pilot
Week 4–6

Before handing the deployment to the client for unsupervised use, run a two-week supervised pilot. Every agent action goes through the approval queue. You review alongside the client. You build the evidence that the agent is behaving as expected and that the client is comfortable with the oversight model.

Pilot period monitoring
Process
Owner: CAOP operator.
During the pilot, review the observability dashboard daily. Check approval queue items together with the client's reviewer. Flag any agent outputs that are unexpected, off-topic, or incorrect. Document each instance in your incident log, not because they are necessarily serious, but because the client needs to see that you are tracking the agent's behaviour systematically. A clean incident log is evidence of a well-configured deployment; a few documented minor incidents with appropriate responses are evidence of a mature monitoring practice.
30–60 mins/day during pilot
Pilot review and sign-off
Deliverable
Owner: CAOP operator + client stakeholder.
At the end of the pilot, produce a pilot review document: number of tasks processed, number of approval queue decisions, number of incidents, cost vs. estimate, client reviewer feedback. Get written sign-off from the client stakeholder before moving to production operation. This sign-off is your evidence of informed consent for the production deployment — it matters if a regulator ever asks how you validated the deployment before going live.
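The figures in the pilot review document can be derived from raw counts rather than assembled by hand. The sketch below assumes a hypothetical `PilotStats` record; the field names and the two derived ratios are illustrative choices, not a prescribed PAI report format.

```python
from dataclasses import dataclass


@dataclass
class PilotStats:
    """Raw counts collected over the pilot (hypothetical field names)."""
    tasks_processed: int
    approved: int
    rejected: int
    incidents: int
    actual_cost: float
    estimated_cost: float


def pilot_summary(stats: PilotStats) -> dict:
    """Derive the headline numbers for the pilot review document."""
    decisions = stats.approved + stats.rejected
    if stats.estimated_cost:
        variance = (stats.actual_cost - stats.estimated_cost) / stats.estimated_cost
        cost_variance_pct = round(variance * 100, 1)
    else:
        cost_variance_pct = None  # no estimate to compare against
    return {
        "tasks_processed": stats.tasks_processed,
        "approval_decisions": decisions,
        "rejection_rate": stats.rejected / decisions if decisions else 0.0,
        "incidents": stats.incidents,
        "cost_variance_pct": cost_variance_pct,
    }
```

For a pilot with 400 tasks, 380 approvals, 20 rejections, and a cost of 132 against an estimate of 120, the summary reports a 5% rejection rate and a 10% cost overrun. Client reviewer feedback is qualitative and would still be written by hand.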
Half day
06
Operate as a managed service
Ongoing

Deployment is only the beginning. Ongoing managed operation covers monitoring, incident response, model updates, capability expansion, and periodic assurance work where the client needs it.

Monthly managed service deliverables
Service
Owner: CAOP operator.
Each month: review observability dashboards for drift in agent behaviour (volume, error rate, cost per run), review any incidents and their resolutions, update agent configurations in response to client feedback or changing requirements, review LLM model versions and assess whether an update is appropriate, and produce a one-page operational summary for the client stakeholder. This summary is the evidence that the managed service is active and that the client is getting value.
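The monthly drift review can be supported by comparing this month's metrics with a baseline, such as the figures from the pilot. The sketch below is one possible approach: the 25% tolerance and the metric names are assumptions chosen for illustration, and the right thresholds depend on the agent and the client.

```python
def drift_report(baseline: dict, current: dict, tolerance: float = 0.25) -> dict:
    """Return metrics whose relative change from baseline exceeds the tolerance."""
    flagged = {}
    for name, base in baseline.items():
        if base == 0 or name not in current:
            continue  # cannot compute a relative change for this metric
        change = (current[name] - base) / base
        if abs(change) > tolerance:
            flagged[name] = round(change, 3)
    return flagged


baseline = {"volume": 1000, "error_rate": 0.02, "cost_per_run": 0.04}
this_month = {"volume": 1100, "error_rate": 0.03, "cost_per_run": 0.04}
flagged = drift_report(baseline, this_month)
```

In this example only `error_rate` is flagged, with a relative change of 0.5 (a 50% increase), which is the kind of item that belongs in the one-page operational summary.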
4–6 hrs/month per deployment
Owner: CAIA-certified auditor.
Where the client requires periodic assurance, produce a quarterly review covering PSF domain status for each deployed agent, audit-log review for the period, incidents and root causes, confirmation that data-protection controls remain effective, and recommendations for the next quarter.
1–2 days per client per quarter
Capability expansion
Growth
Owner: Governance lead + CAOP operator.
Once the first agent is stable, revisit the discovery backlog and scope the next use case only where there is clear value, a named owner, and enough governance capacity to operate it well. Reuse the same scoping, deployment, and pilot discipline each time.
Scoped per use case

Pricing reference

These are reference ranges in GBP. Adjust for your market, client size, and the complexity of the client's environment. Setup fees reflect the one-time cost of deployment, configuration, and the supervised pilot. Monthly fees reflect ongoing managed operation, monitoring, and monthly reporting.

Starter Deployment
Setup: £4,500–£7,500
Monthly: £800–£1,500/mo
  • 1–2 agents
  • CAOP operator certification on team
  • Docker + Redis provisioning
  • 2-week supervised pilot
  • Monthly operational summary
  • Approval queue configuration
First agent deployment for SME clients. Good reference engagement for building the practice.
Most common
Regulated Client Package
Setup: £10,000–£18,000
Monthly: £2,000–£3,500/mo
  • 3–5 agents
  • CAOP + CAIA certified team
  • Full PSF compliance configuration
  • Quarterly audit reports
  • OTel observability stack
  • Incident response SLA
  • Client reviewer training
Financial services, healthcare, legal, and government clients with compliance obligations.
Enterprise Sovereign Platform
Setup: £25,000–£50,000
Monthly: £5,000–£10,000/mo
  • Unlimited agents
  • Kubernetes deployment
  • Full team certification
  • Custom compliance reporting
  • Executive briefings
  • Priority incident response
  • Annual PSF gap assessment
Large enterprises requiring sovereign infrastructure across multiple departments or divisions.
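To compare tiers on a like-for-like basis, the first-year value of an engagement is the setup fee plus twelve monthly fees. The sketch below works that arithmetic from the reference ranges above. The `TIERS` table copies those figures in GBP; the function name and the twelve-month default are illustrative assumptions.

```python
# Reference ranges from the pricing section, in GBP: (low, high).
TIERS = {
    "Starter Deployment": {"setup": (4500, 7500), "monthly": (800, 1500)},
    "Regulated Client Package": {"setup": (10000, 18000), "monthly": (2000, 3500)},
    "Enterprise Sovereign Platform": {"setup": (25000, 50000), "monthly": (5000, 10000)},
}


def first_year_range(tier: str, months: int = 12) -> tuple:
    """Return (low, high) contract value: setup fee plus `months` monthly fees."""
    fees = TIERS[tier]
    low = fees["setup"][0] + months * fees["monthly"][0]
    high = fees["setup"][1] + months * fees["monthly"][1]
    return low, high
```

On these figures the first year comes to £14,100–£25,500 for the Starter Deployment, £34,000–£60,000 for the Regulated Client Package, and £85,000–£170,000 for the Enterprise Sovereign Platform.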

Certifications for the practice

Three PAI credentials are relevant to this kind of work: CAOP for operation, CAIG for governance, and CAIA for evidence-based review.

CAOP
Certified Agent Operator

Operating production agent systems: PSF compliance, observability, approval queues, incident response, model updates.

Practice role: operation
CAIG
Certified AI Governance Professional

Governance of production AI systems: accountability, risk tiering, control design, and operating evidence.

Practice role: governance
CAIA
Certified AI Auditor

Producing compliance audit reports: PSF gap assessment, audit log review, regulatory evidence packages, board-level reporting.

Practice role: review

Join the PAI MSP Programme

The PAI MSP Programme supports managed service providers building an AI agent practice. Review the discovery framework, proposal templates, client briefing decks, and credential pathways used across the programme.
