Production AI Institute · PSF v1.1 open standard



Build a Managed AI Agent Practice with Sovereign Infrastructure

Most MSPs selling AI services are reselling SaaS subscriptions: Copilot seats, ChatGPT Enterprise licences, or third-party AI tools. A sovereign agent deployment practice is different: you deploy and operate AI agent infrastructure inside the client's own environment, with certified practitioners, documented governance, and a managed operating model. This playbook explains how to build it.

Why sovereign infrastructure is the MSP opportunity

Some clients in financial services, healthcare, legal, government, or other high-sensitivity settings need tighter control over where data is processed, how access is governed, and what evidence is available later. In those cases, sovereign self-hosted infrastructure may be the right answer because the runtime and operational data remain inside a client-controlled perimeter.

The reference stack

Agent runtime: Trinity (Ability.AI). Self-hosted, open-source, security-audited.

Infrastructure: Docker / Kubernetes. Isolation, reproducibility, resource limits.

Agent state: Client GitHub. Version-controlled, client-owned.

Task queue: Redis Streams. Durable agent task queueing.

LLM provider: OpenAI / Anthropic / Ollama. Client-provisioned API credentials.

Observability: OpenTelemetry + client stack. Traces, costs, alerting.

Human oversight: Trinity approval queue. Built-in, configurable per action.

Audit log: Trinity (hash-chained). Tamper-evident, exportable.

This is a reference stack, not a mandate. Clients with existing Kubernetes infrastructure, GitLab instead of GitHub, or on-premise LLMs (via Ollama or vLLM) can substitute accordingly. The sovereignty principle holds as long as the runtime and data stay within the client's perimeter.

The deployment playbook

Six phases from certification to managed service. Each phase has defined deliverables, owners, and time estimates. The full cycle from certification to first production agent runs 6–8 weeks.

Get certified

Week 1–2

Before you deploy anything for a client, your team needs the credentials that prove you understand what you are doing. In the agent deployment practice, two certifications are essential and one is strongly recommended.

CAOP — Certified Agent Operator

Essential

Owner: The person who will operate the deployed agent environment day-to-day.

CAOP covers the eight PSF domains as they apply to running production agent systems: input governance, output validation, data protection, observability, deployment safety, human oversight, security, and vendor resilience. It is the credential that clients in regulated sectors will ask for.

⏱ 10–12 hours study + 90-min exam

CAIG — Certified AI Governance Professional

Essential

Owner: The practitioner responsible for governance, risk framing, and operating controls.

CAIG covers governance, accountability, risk tiering, and audit-ready operating practice for production AI. It is useful when the client conversation extends beyond implementation into who owns the system, what controls apply, and how the deployment will be governed over time.

⏱ 8–10 hours study + 75-min exam

CAIA — Certified AI Auditor

Recommended

Owner: The team member who produces governance reports for regulated clients.

Some clients need periodic assurance work once an agent is in production. CAIA is the relevant pathway for practitioners conducting evidence-based reviews, documenting findings, and assessing deployments against the PSF.

⏱ 8–10 hours study + 75-min exam

Understand the sovereign architecture

Week 2

The MSP agent deployment model is fundamentally different from reselling SaaS AI tools. You are deploying infrastructure inside the client's environment (or a dedicated environment you manage on their behalf). Understanding this distinction is essential for the client conversation, the scope-of-work, and the ongoing managed service.

What 'sovereign' means in practice

Concept

Owner: Everyone in the practice.

In a sovereign agent deployment, the runtime (Trinity) runs inside Docker on the client's infrastructure: their on-premise servers, their private cloud, or a dedicated managed environment you provision for them. Data processed by agents stays within that perimeter, and agent state is stored in the client's GitHub repository. No operational data is transmitted to Ability.AI. The one outbound path to be explicit about is the LLM call: with a hosted provider (OpenAI or Anthropic), prompts and responses go to that provider under the client's own API credentials, while a self-hosted model via Ollama keeps inference inside the perimeter as well. This is the answer to 'where does our data go?', and for regulated clients it is often the only acceptable answer.

⏱ Read Trinity PSF assessment + architecture docs

Infrastructure prerequisites

Technical

Owner: Lead engineer.

Trinity requires: Docker (or Kubernetes for larger deployments), a GitHub account or self-hosted Gitea for agent state, Redis (for task queuing via Redis Streams), and the LLM API credentials the client will use (OpenAI, Anthropic, or a self-hosted model via Ollama). For most MSP client environments, this means confirming Docker is available and provisioning a small Redis instance — the additional infrastructure footprint is modest.

⏱ 2–4 hours environment assessment
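The environment assessment above can be partly scripted. The following is a minimal sketch, not a Trinity tool: the function name, the default Redis port, and the environment variable names are assumptions chosen for illustration, and a client using a self-hosted model via Ollama would not need API keys at all.

```python
import os
import shutil
import socket


def check_prerequisites(redis_host="localhost", redis_port=6379,
                        credential_vars=("OPENAI_API_KEY", "ANTHROPIC_API_KEY"),
                        env=None, which=shutil.which,
                        connect=socket.create_connection):
    """Return a dict mapping each prerequisite to True or False.

    `which` and `connect` are injectable so the check can be exercised
    without a real Docker install or Redis instance.
    """
    env = os.environ if env is None else env

    # Docker CLI present on PATH (does not prove the daemon is running).
    results = {"docker_cli": which("docker") is not None}

    # Redis reachable over TCP (a connection test only, not an auth test).
    try:
        connect((redis_host, redis_port), timeout=2.0).close()
        results["redis_reachable"] = True
    except OSError:
        results["redis_reachable"] = False

    # At least one LLM credential is set (variable names are assumptions).
    results["llm_credentials"] = any(env.get(name) for name in credential_vars)
    return results


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'ok' if ok else 'MISSING'}")
```

A result with any False value is a prompt for a conversation with the client's infrastructure team, not a hard failure; Kubernetes-based environments would need a different check entirely.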

Define the first agent use case

Week 2–3

The most common mistake in an agent deployment engagement is trying to automate everything at once. Start with one well-defined use case where the agent's scope is narrow, the data is relatively non-sensitive, and the value is demonstrable within 30 days. This builds client confidence and gives you a reference deployment.

Discovery conversation

Process

Owner: Account manager + CAOP-certified operator.

Use the PAI discovery framework to identify workflows in the client's environment that are high-repetition, rules-based, and currently performed by people. Focus on tasks where the agent's output can be reviewed before it takes effect: you can then deploy with an approval queue and give the client visibility into every agent action before it executes. Good first agents: support ticket triage, invoice routing, document classification, internal query answering.

⏱ 2–3 hour structured session

Use case scoping document

Deliverable

Owner: CAOP operator + account manager.

Before configuring anything, produce a one-page use case scope document covering: the task the agent will perform, the data it will access (and what it will not access), the channels it will be reachable through, the approval rules (what requires human sign-off before the agent acts), the success metric (how you will know it is working), and the exit criteria (what causes the agent to be shut down or referred to a human). This document becomes the configuration specification and the governance reference.

⏱ Half day
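One way to keep the scope document honest is to hold it as structured data and refuse to proceed while any section is empty. This is an illustrative sketch only: the class and field names are assumptions that mirror the six items listed above, not a PAI or Trinity schema.

```python
from dataclasses import dataclass, fields


@dataclass
class UseCaseScope:
    """The six sections of the one-page scope document (names are illustrative)."""
    task: str
    data_in_scope: list
    data_out_of_scope: list
    channels: list
    approval_rules: list
    success_metric: str
    exit_criteria: list


def missing_sections(scope):
    """Return the names of sections that are still empty."""
    return [f.name for f in fields(scope) if not getattr(scope, f.name)]


# Hypothetical example: a support ticket triage agent with no exit criteria yet.
draft = UseCaseScope(
    task="Classify inbound support tickets and suggest a queue",
    data_in_scope=["ticket subject", "ticket body"],
    data_out_of_scope=["payment details", "HR records"],
    channels=["#support-triage Slack channel"],
    approval_rules=["any reply sent to a customer"],
    success_metric="80% of suggested queues accepted by reviewers",
    exit_criteria=[],
)

if __name__ == "__main__":
    print(missing_sections(draft))
```

Because the same object can later be compared against the agent's configuration, the scope document stays the governance reference rather than a file nobody reopens.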

Deploy and configure Trinity

Week 3–4

Trinity is deployed as a Docker container into the client's environment. Configuration is done through Trinity's agent definition files, which live in the client's GitHub repository. Every change to an agent's configuration is version-controlled and auditable.

Environment provisioning

Technical

Owner: Delivery engineer.

Provision the Docker host (or Kubernetes namespace), Redis instance, and GitHub repository. Configure network access: Trinity needs outbound access to the LLM API endpoint and to any client systems the agent will integrate with. It does not need inbound access from the internet unless you are exposing a public webhook for external triggers. Establish separate credential scopes for the development instance and the production instance.

⏱ 4–8 hours
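The outbound-only network rule above can be written down as an allowlist and checked against the integrations the agent is expected to call. This is a sketch under assumptions: the function name and the client hostname are hypothetical, and real enforcement belongs in the firewall or Kubernetes network policy, not in a script.

```python
from urllib.parse import urlparse


def blocked_destinations(urls, allowed_hosts):
    """Return the URLs whose host is not on the egress allowlist."""
    allowed = {host.lower() for host in allowed_hosts}
    blocked = []
    for url in urls:
        host = (urlparse(url).hostname or "").lower()
        if host not in allowed:
            blocked.append(url)
    return blocked


# Hypothetical allowlist: one LLM endpoint and one client system.
ALLOWED = ["api.openai.com", "helpdesk.client.example"]

PLANNED_CALLS = [
    "https://api.openai.com/v1/chat/completions",
    "https://helpdesk.client.example/api/tickets",
    "https://unknown.example/upload",
]

if __name__ == "__main__":
    for url in blocked_destinations(PLANNED_CALLS, ALLOWED):
        print("not on allowlist:", url)
```

Keeping separate allowlists for the development and production instances follows the same logic as keeping separate credential scopes.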

Agent definition and guardrails

Technical

Owner: CAOP operator.

Define the agent in Trinity's configuration: name, description, system prompt, permitted tools, channel bindings (which Slack channels or webhooks can trigger it), RBAC restrictions (which user roles can interact with it), and approval rules (which action types require human sign-off before execution). The guardrails are configured here — this is the PSF D1 and D5 work. Every tool the agent has access to should be the minimum required for the use case. Never grant an agent write access to a system it only needs read access to.

⏱ 4–8 hours per agent
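The least-privilege rule above lends itself to an automated check before a definition is merged. The sketch below is hypothetical throughout: Trinity's actual agent definition format is not reproduced here, so the dictionary layout, tool names, and access levels are assumptions made purely to illustrate the comparison.

```python
# Access levels ordered from least to most privileged (illustrative).
LEVELS = {"read": 1, "write": 2}

# Hypothetical agent definition; not Trinity's real schema.
AGENT = {
    "name": "ticket-triage",
    "tools": {"helpdesk": "read", "crm": "write"},
    "approval_required": ["send_customer_reply"],
}

# What the scope document says the use case actually needs.
REQUIRED = {"helpdesk": "read", "crm": "read"}


def excess_permissions(granted, required):
    """List every tool grant that exceeds what the use case requires."""
    findings = []
    for tool, level in sorted(granted.items()):
        needed = required.get(tool)
        if needed is None:
            findings.append(f"{tool}: granted but not required")
        elif LEVELS[level] > LEVELS[needed]:
            findings.append(f"{tool}: {level} granted, {needed} required")
    return findings


if __name__ == "__main__":
    for finding in excess_permissions(AGENT["tools"], REQUIRED):
        print(finding)
```

Since agent definitions live in the client's repository, a check like this could run on every proposed configuration change, so an over-broad grant is caught in review rather than in production.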

Observability configuration

Technical

Owner: CAOP operator.

Configure Trinity's OpenTelemetry exporter to send traces to your observability stack (or the client's). Set up dashboards for agent execution volume, error rate, approval queue depth, and LLM cost per agent. Configure alerting for anomalous execution patterns — an agent that executes 10× its expected daily volume should trigger a review before it consumes significant API spend. The tamper-evident audit log is on by default — confirm it is being written to durable storage.

⏱ 2–4 hours
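The 10× volume rule above is simple enough to express directly. This is a minimal sketch, assuming daily execution counts are already available per agent from your observability stack; the function name and the decision to flag agents with no baseline are illustrative choices, not Trinity behaviour.

```python
def agents_to_review(counts, expected, multiplier=10):
    """Return agents whose daily executions reach multiplier x their baseline.

    counts:   {agent_name: executions today}
    expected: {agent_name: expected daily executions}
    Agents with no usable baseline are flagged too, since their volume
    cannot be judged either way.
    """
    flagged = []
    for agent, count in sorted(counts.items()):
        baseline = expected.get(agent)
        if baseline is None or baseline <= 0:
            flagged.append(agent)
        elif count >= multiplier * baseline:
            flagged.append(agent)
    return flagged


if __name__ == "__main__":
    today = {"triage": 1200, "invoices": 90, "new-agent": 3}
    baselines = {"triage": 100, "invoices": 40}
    print(agents_to_review(today, baselines))
```

In practice the threshold would sit in the alerting layer rather than in a script, and a lower multiplier may suit agents whose per-run LLM cost is high.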

Approval queue configuration

Technical

Owner: CAOP operator + client nominated reviewer.

Configure the approval queue for every action type that is irreversible or customer-facing. Assign the client's nominated reviewer to these approval tasks. Walk the reviewer through the approval interface: how they receive notification, what information is presented, how to approve or reject, and what happens to rejected actions. This is the human oversight layer — get the client's reviewer comfortable with it before going live.

⏱ 2–3 hours including client walkthrough
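The rule stated above (approval for every action type that is irreversible or customer-facing) can be captured as a small classification step when drafting the queue configuration. The names below are assumptions for illustration; how Trinity itself represents approval rules is not shown here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ActionType:
    name: str
    irreversible: bool
    customer_facing: bool


def requires_approval(action):
    """An action needs human sign-off if it is irreversible or customer-facing."""
    return action.irreversible or action.customer_facing


def build_approval_rules(actions, reviewer):
    """Map each action that needs sign-off to the nominated reviewer."""
    return {a.name: reviewer for a in actions if requires_approval(a)}


# Hypothetical action types for a support agent.
ACTIONS = [
    ActionType("tag_ticket", irreversible=False, customer_facing=False),
    ActionType("send_customer_reply", irreversible=False, customer_facing=True),
    ActionType("issue_refund", irreversible=True, customer_facing=True),
    ActionType("delete_record", irreversible=True, customer_facing=False),
]

if __name__ == "__main__":
    for name, who in build_approval_rules(ACTIONS, "reviewer@client.example").items():
        print(f"{name} -> {who}")
```

Listing the action types explicitly also gives the client's reviewer a concrete picture of what will and will not reach their queue before go-live.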

Run a supervised pilot

Week 4–6

Before handing the deployment to the client for unsupervised use, run a two-week supervised pilot. Every agent action goes through the approval queue. You review alongside the client. You build the evidence that the agent is behaving as expected and that the client is comfortable with the oversight model.

Pilot period monitoring

Process

Owner: CAOP operator.

During the pilot, review the observability dashboard daily. Check approval queue items together with the client's reviewer. Flag any agent outputs that are unexpected, off-topic, or incorrect. Document each instance in your incident log, not because every instance is serious, but because the client needs to see that you are tracking the agent's behaviour systematically. A clean incident log is evidence of a well-configured deployment; a few documented minor incidents with appropriate responses are evidence of a mature monitoring practice.

⏱ 30–60 mins/day during pilot

Pilot review and sign-off

Deliverable

Owner: CAOP operator + client stakeholder.

At the end of the pilot, produce a pilot review document: number of tasks processed, number of approval queue decisions, number of incidents, cost vs. estimate, client reviewer feedback. Get written sign-off from the client stakeholder before moving to production operation. This sign-off is your evidence of informed consent for the production deployment — it matters if a regulator ever asks how you validated the deployment before going live.

⏱ Half day
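The figures in the pilot review can be derived from the pilot's own records rather than assembled by hand. The sketch below assumes a simple record shape (each task carries a "decision" field) that is hypothetical; adapt it to whatever export your approval queue and incident log actually provide.

```python
def pilot_summary(tasks, incidents, actual_cost, estimated_cost):
    """Summarise a supervised pilot for the review document.

    tasks:     list of dicts, each with a "decision" of "approved" or "rejected"
    incidents: list of incident records (only the count is used here)
    """
    if estimated_cost <= 0:
        raise ValueError("estimated_cost must be positive")

    approved = sum(1 for t in tasks if t["decision"] == "approved")
    rejected = sum(1 for t in tasks if t["decision"] == "rejected")
    decided = approved + rejected

    return {
        "tasks_processed": len(tasks),
        "approval_decisions": decided,
        "approval_rate": round(approved / decided, 3) if decided else None,
        "incidents": len(incidents),
        # Positive means the pilot cost more than estimated.
        "cost_variance_pct": round(
            100 * (actual_cost - estimated_cost) / estimated_cost, 1
        ),
    }


if __name__ == "__main__":
    sample_tasks = [{"decision": "approved"}] * 8 + [{"decision": "rejected"}] * 2
    print(pilot_summary(sample_tasks, incidents=[{"id": 1}],
                        actual_cost=120.0, estimated_cost=100.0))
```

Client reviewer feedback is qualitative and still has to be gathered in conversation; the numbers only frame it.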

Operate as a managed service

Ongoing

Deployment is only the beginning. Ongoing managed operation covers monitoring, incident response, model updates, capability expansion, and periodic assurance work where the client needs it.

Monthly managed service deliverables

Service

Owner: CAOP operator.

Each month: review observability dashboards for drift in agent behaviour (volume, error rate, cost per run), review any incidents and their resolutions, update agent configurations in response to client feedback or changing requirements, review LLM model versions and assess whether an update is appropriate, and produce a one-page operational summary for the client stakeholder. This summary is the evidence that the managed service is active and that the client is getting value.

⏱ 4–6 hrs/month per deployment
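The monthly drift review above compares this month's volume, error rate, and cost per run against the previous month's. A minimal sketch of that comparison follows; the 25% threshold and the metric names are assumptions for illustration, and a sensible threshold will differ per client and per agent.

```python
def drifted_metrics(previous, current, threshold_pct=25.0):
    """Return {metric: percent change} for metrics that moved past the threshold.

    A metric whose previous value was zero has no meaningful percentage,
    so it is reported with a value of None if it is now non-zero.
    """
    drifted = {}
    for metric in sorted(previous):
        before = previous[metric]
        after = current.get(metric, 0.0)
        if before == 0:
            if after != 0:
                drifted[metric] = None
            continue
        change = 100.0 * (after - before) / before
        if abs(change) >= threshold_pct:
            drifted[metric] = round(change, 1)
    return drifted


if __name__ == "__main__":
    last_month = {"runs": 1000, "error_rate": 0.02, "cost_per_run": 0.040}
    this_month = {"runs": 1100, "error_rate": 0.03, "cost_per_run": 0.041}
    print(drifted_metrics(last_month, this_month))
```

Anything this returns is a candidate line item for the one-page operational summary, together with the explanation once it has been investigated.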

Quarterly compliance audit (CAIA)

Service

Owner: CAIA-certified auditor.

Where the client requires periodic assurance, produce a quarterly review covering PSF domain status for each deployed agent, audit-log review for the period, incidents and root causes, confirmation that data-protection controls remain effective, and recommendations for the next quarter.

⏱ 1–2 days per client per quarter
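The audit-log review relies on the log being hash-chained, so it helps to understand what that property lets an auditor verify. The sketch below shows the general technique with SHA-256; it is not Trinity's log format, and the entry layout, field names, and genesis value are assumptions for illustration.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first entry's prev_hash


def entry_hash(prev_hash, payload):
    """Hash the previous entry's hash together with this entry's payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append_entry(log, payload):
    """Append a payload, linking it to the hash of the previous entry."""
    prev = log[-1]["hash"] if log else GENESIS
    log.append({"payload": payload, "prev_hash": prev,
                "hash": entry_hash(prev, payload)})


def first_broken_index(log):
    """Return the index of the first entry that fails verification, or None."""
    prev = GENESIS
    for i, entry in enumerate(log):
        if entry["prev_hash"] != prev:
            return i
        if entry["hash"] != entry_hash(prev, entry["payload"]):
            return i
        prev = entry["hash"]
    return None


if __name__ == "__main__":
    log = []
    for action in ("classify_ticket", "send_customer_reply", "close_ticket"):
        append_entry(log, {"agent": "ticket-triage", "action": action})
    print(first_broken_index(log))           # intact chain
    log[1]["payload"]["action"] = "edited"   # simulate tampering
    print(first_broken_index(log))
```

Because each hash depends on everything before it, editing one entry invalidates that entry and forces every later one to be recomputed, which is what makes the log tamper-evident rather than tamper-proof. Storing the latest hash somewhere outside the log is what lets an auditor detect a wholesale rewrite.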

Capability expansion

Growth

Owner: Governance lead + CAOP operator.

Once the first agent is stable, revisit the discovery backlog and scope the next use case only where there is clear value, a named owner, and enough governance capacity to operate it well. Reuse the same scoping, deployment, and pilot discipline each time.

⏱ Scoped per use case

Pricing reference

These are reference ranges in GBP. Adjust for your market, client size, and the complexity of the client's environment. Setup fees reflect the one-time cost of deployment, configuration, and the supervised pilot. Monthly fees reflect ongoing managed operation, monitoring, and monthly reporting.

Starter Deployment

Setup: £4,500–£7,500

Monthly: £800–£1,500/mo

✓ 1–2 agents
✓ CAOP operator certification on team
✓ Docker + Redis provisioning
✓ 2-week supervised pilot
✓ Monthly operational summary
✓ Approval queue configuration

First agent deployment for SME clients. Good reference engagement for building the practice.

Most common

Regulated Client Package

Setup: £10,000–£18,000

Monthly: £2,000–£3,500/mo

✓ 3–5 agents
✓ CAOP + CAIA certified team
✓ Full PSF compliance configuration
✓ Quarterly audit reports
✓ OTel observability stack
✓ Incident response SLA
✓ Client reviewer training

Financial services, healthcare, legal, and government clients with compliance obligations.

Enterprise Sovereign Platform

Setup: £25,000–£50,000

Monthly: £5,000–£10,000/mo

✓ Unlimited agents
✓ Kubernetes deployment
✓ Full team certification
✓ Custom compliance reporting
✓ Executive briefings
✓ Priority incident response
✓ Annual PSF gap assessment

Large enterprises requiring sovereign infrastructure across multiple departments or divisions.
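To compare the three tiers on a like-for-like basis, it can help to turn the setup and monthly ranges above into a first-year figure. The sketch below is plain arithmetic on the reference ranges; it assumes twelve billed months and ignores anything not listed, such as LLM API spend, which the client pays through their own credentials.

```python
# Reference ranges in GBP, copied from the tiers above: (low, high).
TIERS = {
    "Starter Deployment": {"setup": (4500, 7500), "monthly": (800, 1500)},
    "Regulated Client Package": {"setup": (10000, 18000), "monthly": (2000, 3500)},
    "Enterprise Sovereign Platform": {"setup": (25000, 50000), "monthly": (5000, 10000)},
}


def first_year_range(tier, months=12):
    """Return (low, high) first-year revenue: setup fee plus monthly fees."""
    setup_lo, setup_hi = TIERS[tier]["setup"]
    monthly_lo, monthly_hi = TIERS[tier]["monthly"]
    return (setup_lo + months * monthly_lo, setup_hi + months * monthly_hi)


if __name__ == "__main__":
    for name in TIERS:
        low, high = first_year_range(name)
        print(f"{name}: £{low:,} to £{high:,}")
```

On these ranges a Starter Deployment comes to £14,100 to £25,500 in its first year, a Regulated Client Package to £34,000 to £60,000, and an Enterprise Sovereign Platform to £85,000 to £170,000.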

Certifications for the practice

Three PAI credentials are relevant to this kind of work: CAOP for operation, CAIG for governance, and CAIA for evidence-based review.

CAOP

Certified Agent Operator

Operating production agent systems: PSF compliance, observability, approval queues, incident response, model updates.

Practice role: operation


CAIG

Certified AI Governance Professional

Governance of production AI systems: accountability, risk tiering, control design, and operating evidence.

Practice role: governance


CAIA

Certified AI Auditor

Producing compliance audit reports: PSF gap assessment, audit log review, regulatory evidence packages, board-level reporting.

Practice role: review


Join the PAI MSP Programme

The PAI MSP Programme supports managed service providers building an AI agent practice. Review the discovery framework, proposal templates, client briefing decks, and credential pathways used across the programme.
