The professional standard for production AI deployment
PSF Assessment · Infrastructure · 6 Strong · 2 Partial · 0 Gap

Trinity in Production: A PSF Domain Assessment

Trinity by Ability.AI is a self-hosted, open-source agent runtime designed from the ground up for sovereign production deployments. Unlike framework libraries that require practitioners to add every production safety control, Trinity ships with governance as a core feature. This assessment evaluates Trinity against the eight PSF domains to determine where it covers practitioners natively, and where additional controls are still required.

What is Trinity?

Trinity is a Docker-deployable agent runtime that runs inside your own infrastructure. Agents connect to users via Slack, Telegram, WhatsApp, Claude Code (via MCP), and public webhooks. Agent state is persisted in your GitHub repository. The platform ships with built-in approval queues (human-in-the-loop), RBAC, OpenTelemetry tracing, execution replay, per-agent cost tracking, and a hash-chained tamper-evident audit log. It has been independently security-audited by UnderDefense. Nine packaged solution verticals are available via trinity install <scaffold>.

The sovereign infrastructure principle

Trinity's most important production safety property is not a feature but an architectural choice. Because Trinity runs inside your own perimeter, the entire vendor-dependency risk class that affects cloud-hosted agent platforms simply does not apply. Data protection is structural rather than contractual. Vendor resilience is architectural rather than promised. Security audit coverage extends to your running instance, not just the vendor's platform. This principle applies across every PSF domain and elevates Trinity's baseline safety posture relative to any cloud-hosted alternative.

Assessment Summary

Domain | Rating | Sovereign advantage
D1 Input Governance | Partial | Configurable
D2 Output Validation | Partial | Configurable
D3 Data Protection | Strong | Structural
D4 Observability | Strong | Configurable
D5 Deployment Safety | Strong | Configurable
D6 Human Oversight | Strong | Configurable
D7 Security | Strong | Structural
D8 Vendor Resilience | Strong | Structural

PSF Domain 1: Input Governance

Partial

Trinity provides structural input governance through per-agent scope definitions and RBAC controls over which users and channels can trigger each agent. Native prompt injection resistance and PII detection require practitioner-added controls.

Trinity enforces input governance at the deployment and routing layer: each agent is scoped to specific channels (Slack, Telegram, WhatsApp, webhook), specific RBAC roles (admin, creator, operator, user), and specific task classes through its configuration. This means that an agent deployed to handle HR queries cannot be triggered by an external webhook intended for a customer support agent, and a user-role caller cannot invoke an operator-privileged workflow. This structural scoping is a meaningful input governance control that most frameworks lack entirely. Where Trinity does not yet provide native controls is at the semantic input layer: there is no built-in prompt injection detection, no PII classifier, and no mechanism to validate that the content of a user message conforms to an expected structure before it enters the agent's context. Practitioners deploying Trinity into environments where users submit free-text inputs must implement these controls at the application layer — before the message is routed to the Trinity agent.

Practitioner action

Add a pre-processing step at each public-facing channel that validates input structure and screens for adversarial patterns before the message reaches the Trinity runtime. For chat channels (Slack, Telegram, WhatsApp), implement an application-layer gateway that runs PII detection (Presidio or equivalent) and basic prompt injection screening before forwarding the payload to Trinity's webhook. Document the permitted input types for each deployed agent in your agent specification sheet.
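The gateway step described above might look something like the following minimal sketch. It is not part of Trinity. The regular expressions, the MAX_INPUT_CHARS limit, and the screen_message function are hypothetical stand-ins; a production gateway would replace the regex-based PII patterns with a dedicated detector such as Presidio and would need a much broader injection rule set.

```python
import re
from dataclasses import dataclass, field

# Hypothetical patterns for illustration only; a real deployment would use a
# dedicated PII detector (the text suggests Presidio or equivalent).
PII_PATTERNS = {
    "email": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "phone": re.compile(r"\+?\d[\d\s().-]{8,}\d"),
}

# Hypothetical phrases that often appear in prompt injection attempts.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.I),
    re.compile(r"reveal (your|the) system prompt", re.I),
]

MAX_INPUT_CHARS = 4000  # assumed limit; tune per agent


@dataclass
class ScreeningResult:
    allowed: bool                  # False means: do not forward to the runtime
    text: str                      # message with detected PII redacted
    reasons: list = field(default_factory=list)


def screen_message(text: str) -> ScreeningResult:
    """Screen one inbound message before it is forwarded to the agent runtime."""
    reasons = []
    if len(text) > MAX_INPUT_CHARS:
        reasons.append("too_long")
    for pattern in INJECTION_PATTERNS:
        if pattern.search(text):
            reasons.append("possible_injection")
            break
    # Redact PII regardless of the allow/deny decision, so logs stay clean.
    redacted = text
    for label, pattern in PII_PATTERNS.items():
        redacted = pattern.sub(f"<{label.upper()}>", redacted)
    return ScreeningResult(allowed=not reasons, text=redacted, reasons=reasons)
```

In this sketch the gateway forwards result.text only when result.allowed is true, and logs result.reasons otherwise. Redacting rather than rejecting PII is a design choice; some deployments may prefer to block such messages outright.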

PSF Domain 2: Output Validation

Partial

Trinity's built-in approval queue provides human-in-the-loop review of consequential outputs before they act on downstream systems. Automated schema validation of agent outputs is not natively enforced and requires practitioner implementation.

Trinity's approval queue is one of its most important production features. Any agent action can be configured to require human approval before execution — the agent pauses, presents the proposed action to a designated reviewer, and only proceeds once approved. This is a meaningful output control for irreversible or high-stakes actions: an agent about to send an email, update a customer record, or trigger a financial transaction can be held at the approval gate until a human has verified the output is correct. What the approval queue does not provide is automated output validation: there is no built-in mechanism to assert that an agent's response conforms to a defined JSON schema, that it does not contain PII that should be redacted before delivery, or that it falls within permitted content categories. These semantic output controls must be added by the practitioner. For multi-agent workflows within Trinity, intermediate outputs flow between agents without automatic schema enforcement — a corrupted intermediate output can propagate to subsequent agents.

Practitioner action

Use Trinity's approval queue for all agent actions that are irreversible or customer-facing. For automated pipelines where human approval is not practical, add an output validation step as a dedicated agent in the workflow; this agent's sole role is to validate the prior agent's output against a defined schema before passing it forward. For structured output requirements, specify the expected schema in the agent's system prompt and use an LLM that supports schema-constrained output (e.g., GPT-4.1 with the json_schema response format) to enforce structure.
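As an illustration of the validation step, the sketch below checks one agent output against a fixed schema using only the Python standard library. The TICKET_SCHEMA fields, the allowed priority values, and the validate_output function are assumptions made for this example, not anything Trinity defines; a real pipeline might use a full JSON Schema validator instead.

```python
import json

# Hypothetical schema for a ticket-triage agent: field name -> required type.
TICKET_SCHEMA = {"ticket_id": str, "priority": str, "summary": str}
ALLOWED_PRIORITIES = {"low", "medium", "high"}


def validate_output(raw: str):
    """Return (ok, errors) for one agent output expected to be a JSON object."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return False, ["top-level value must be an object"]

    errors = []
    # Every schema field must be present and have the expected type.
    for name, expected in TICKET_SCHEMA.items():
        if name not in data:
            errors.append(f"missing field: {name}")
        elif not isinstance(data[name], expected):
            errors.append(f"wrong type for {name}")
    # Reject fields the schema does not know about.
    for name in sorted(data.keys() - TICKET_SCHEMA.keys()):
        errors.append(f"unexpected field: {name}")
    # Constrain the priority field to a fixed set of values.
    priority = data.get("priority")
    if isinstance(priority, str) and priority not in ALLOWED_PRIORITIES:
        errors.append("priority not in allowed set")
    return not errors, errors
```

A workflow could call validate_output between agents and stop, retry, or route to the approval queue whenever ok is false, so that a corrupted intermediate output does not reach the next agent.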

PSF Domain 3: Data Protection

Strong

Trinity's self-hosted, sovereign architecture provides the strongest possible foundation for data protection. Data never leaves your perimeter. Agent state persists in your GitHub. No vendor has access to your operational data.

Trinity's data protection posture is fundamentally different from cloud-hosted agent platforms, and the difference is structural rather than configurable. Because Trinity runs inside your own infrastructure (Docker, on-premise or private cloud), the data an agent processes (user inputs, agent context, tool call payloads, outputs) is handled within your network perimeter. The only data that crosses it is what your own configuration sends out, for example prompts to an externally hosted LLM API. There is no Trinity cloud that receives copies of your operational data. There is no telemetry pipeline sending agent conversation logs to Ability.AI's servers. Agent state is persisted in your own GitHub repository, which you control and can subject to your existing data governance controls.

This architecture makes Trinity compliance-friendly by default for regulated data categories. An MSP deploying Trinity for a healthcare client can assert that no patient data leaves the client's environment, provided the model endpoints the agents call sit inside that environment too. A financial services firm can assert that client financial data is processed only within their regulatory perimeter. These assertions are structurally true rather than dependent on vendor data processing agreements. The tamper-evident audit log (hash-chained, append-only, CSV/JSON export) provides the evidence trail needed for regulatory audit.


PSF Domain 4: Observability

Strong

Trinity provides production-grade observability natively: OpenTelemetry tracing, per-agent cost tracking, execution replay, and a hash-chained audit log. This is the most complete native observability of any agent runtime we have assessed.

Observability is where Trinity most clearly differentiates from other agent runtimes. Most frameworks provide development-grade logging that requires significant additional tooling to become production-grade. Trinity ships with OpenTelemetry instrumentation, which means traces are emitted in a standard format compatible with any OTLP-compatible backend (Datadog, Grafana Tempo, Honeycomb, Jaeger, and others). Every agent execution is captured as a structured trace with per-step spans. In addition to OTEL tracing, Trinity provides execution replay — the ability to re-run a past agent execution against its original inputs for debugging or audit purposes. Per-agent cost tracking gives practitioners a clear view of LLM spend per workflow and per agent, which is essential for client billing in MSP deployments. The tamper-evident audit log is particularly significant for regulated environments: it is hash-chained and append-only, meaning any retrospective alteration is cryptographically detectable. This is the level of audit evidence that financial services, healthcare, and government clients require. For MSPs, this means you can provide clients with a verifiable record of every action their AI agents took.
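To make the hash-chaining property concrete, here is a minimal sketch of how such a log can work in general. It is an illustration of the technique, not Trinity's implementation; the entry layout, the GENESIS placeholder, and the function names are assumptions.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed placeholder "previous hash" for the first entry


def _digest(prev_hash: str, event: dict) -> str:
    # Serialise with sorted keys so the same event always hashes the same way.
    payload = json.dumps(event, sort_keys=True).encode("utf-8")
    return hashlib.sha256(prev_hash.encode("utf-8") + payload).hexdigest()


def append_entry(log: list, event: dict) -> None:
    """Append an event, linking it to the hash of the previous entry."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    log.append({"event": event, "prev_hash": prev_hash,
                "hash": _digest(prev_hash, event)})


def verify_chain(log: list) -> bool:
    """Recompute every hash from the start.

    An edited or reordered entry, or one removed from the middle, breaks the
    chain. Detecting truncation at the tail additionally requires the latest
    hash to be stored somewhere outside the log.
    """
    prev_hash = GENESIS
    for entry in log:
        if entry["prev_hash"] != prev_hash:
            return False
        if entry["hash"] != _digest(prev_hash, entry["event"]):
            return False
        prev_hash = entry["hash"]
    return True
```

Because each hash covers the previous one, altering an old entry would require recomputing every later hash, which is what makes retrospective changes detectable by anyone holding a copy of a later hash.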


PSF Domain 5: Deployment Safety

Strong

Trinity provides per-agent guardrails, Docker-based isolation, channel scope enforcement, and approval queues as deployment safety controls. The self-hosted model eliminates shared infrastructure risks. Action budgets require explicit practitioner configuration.

Trinity's deployment safety model starts from a fundamentally safer baseline than cloud-hosted runtimes. Because each Trinity deployment is isolated within Docker on your own infrastructure, there is no shared-tenancy risk: your agents cannot be affected by other customers' workloads, and your credentials are not stored on a shared platform. The per-agent guardrail system constrains each agent's permitted tools, data access, and operational scope at configuration time — an agent scoped to read customer support tickets cannot be reconfigured at runtime to access financial records. Channel scope enforcement adds another layer: an agent deployed to a specific Slack channel cannot be invoked from a webhook or a different channel without explicit reconfiguration. The approval queue functions as a production circuit breaker: for any action type classified as consequential, the workflow pauses for human review rather than executing autonomously. Where practitioners must still add controls is at the resource budget layer: Trinity does not natively enforce per-run LLM token budgets or maximum execution time limits. For workflows that could theoretically run indefinitely (e.g., research agents with recursive tool use), practitioners should configure explicit timeout and cost-ceiling controls at the deployment layer.

Practitioner action

Define maximum execution time and LLM token budgets for each deployed agent and enforce them at the Trinity configuration layer or at the infrastructure layer (container resource limits). For agents that use external APIs, implement rate limiting at the tool level to prevent runaway API consumption. Document the expected cost-per-run for each agent and configure alerting when the actual cost of a run exceeds the expected cost by more than 20%.
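One way to express these limits in application code is sketched below. The RunBudget class, the BudgetExceeded exception, and the cost_alert helper are hypothetical names for this example; they are not Trinity configuration options, and the sketch assumes the calling code can observe the token count of each LLM call.

```python
import time


class BudgetExceeded(RuntimeError):
    """Raised when a run goes past its configured time or token ceiling."""


class RunBudget:
    """Tracks one agent run against a wall-clock limit and a token limit."""

    def __init__(self, max_seconds: float, max_tokens: int, clock=time.monotonic):
        self.max_seconds = max_seconds
        self.max_tokens = max_tokens
        self._clock = clock          # injectable so the limit can be tested
        self._started = clock()
        self.tokens_used = 0

    def charge(self, tokens: int) -> None:
        """Record tokens from one LLM call; stop the run if a ceiling is crossed."""
        self.tokens_used += tokens
        if self.tokens_used > self.max_tokens:
            raise BudgetExceeded(
                f"token budget exceeded: {self.tokens_used}/{self.max_tokens}")
        if self._clock() - self._started > self.max_seconds:
            raise BudgetExceeded("execution time limit exceeded")


def cost_alert(actual_cost: float, expected_cost: float,
               tolerance: float = 0.20) -> bool:
    """True when a run costs more than `tolerance` above the expected cost."""
    return actual_cost > expected_cost * (1 + tolerance)
```

An agent loop would call budget.charge(n) after every model call and treat BudgetExceeded as a hard stop. This check only fires between calls, so a container-level timeout is still useful as a backstop for a single call that hangs.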

PSF Domain 6: Human Oversight

Strong

Trinity's approval queue is a first-class runtime primitive, not an afterthought. Human oversight can be enforced at any workflow step, for any action category, with reviewer assignment and audit trail. This is the most complete native oversight mechanism we have assessed.

Most agent frameworks treat human oversight as a design pattern — something the practitioner builds on top of the framework by structuring their workflow to include a pause. Trinity treats it as a runtime primitive. The approval queue is a built-in component of the agent runtime, not an external add-on. When an agent reaches an action requiring approval, it automatically creates an approval task, assigns it to the configured reviewer (by role or by identity), and holds execution until the reviewer approves or rejects. Rejection can trigger alternative workflow paths. Approvals are recorded in the tamper-evident audit log with the reviewer's identity and timestamp. This design makes it structurally easy to comply with the PSF Domain 6 requirement that consequential actions require human review before execution — the infrastructure for that requirement is already in place. Practitioners are not required to build a bespoke pause-and-notify mechanism; they configure which action types require approval, and Trinity handles the rest. For MSPs, this means you can credibly tell clients that their AI agents cannot take any irreversible action without a named human approving it — and provide the audit log entry as evidence.
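The lifecycle described above (create a task, hold execution, record the decision, branch on rejection) can be sketched as a small state machine. This is a generic illustration under assumed names (ApprovalQueue, ApprovalTask, run_if_approved), not Trinity's API, and it keeps the audit record in a plain list rather than a hash-chained log.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum
from itertools import count
from typing import Optional


class Status(Enum):
    PENDING = "pending"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class ApprovalTask:
    task_id: int
    action: str
    reviewer_role: str
    status: Status = Status.PENDING
    decided_by: Optional[str] = None
    decided_at: Optional[datetime] = None


class ApprovalQueue:
    def __init__(self):
        self._ids = count(1)
        self.tasks = {}
        self.audit = []

    def submit(self, action: str, reviewer_role: str) -> ApprovalTask:
        """Create a pending task; the caller must not execute the action yet."""
        task = ApprovalTask(next(self._ids), action, reviewer_role)
        self.tasks[task.task_id] = task
        return task

    def decide(self, task_id: int, reviewer: str, approve: bool) -> ApprovalTask:
        """Record a reviewer's decision once, with identity and timestamp."""
        task = self.tasks[task_id]
        if task.status is not Status.PENDING:
            raise ValueError("task already decided")
        task.status = Status.APPROVED if approve else Status.REJECTED
        task.decided_by = reviewer
        task.decided_at = datetime.now(timezone.utc)
        self.audit.append({"task_id": task_id, "action": task.action,
                           "decision": task.status.value, "reviewer": reviewer,
                           "at": task.decided_at.isoformat()})
        return task


def run_if_approved(task: ApprovalTask, execute, on_reject=None):
    """Execute only approved actions; rejected ones take the alternative path."""
    if task.status is Status.APPROVED:
        return execute()
    if task.status is Status.REJECTED and on_reject is not None:
        return on_reject()
    return None  # still pending (or rejected with no fallback): do nothing
```

The key property is that the action callable is never invoked while the task is pending, and that each decision leaves a record naming the reviewer, which is the evidence an auditor would ask for.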


PSF Domain 7: Security

Strong

Trinity has been independently security-audited by UnderDefense. RBAC with four defined roles enforces least-privilege access. The self-hosted model eliminates the SaaS platform attack surface. Tamper-evident logging supports incident investigation.

Trinity's security posture rests on four pillars. First, independent audit: UnderDefense conducted a formal security assessment of Trinity, which distinguishes it from frameworks that are self-asserted as secure. Second, RBAC: the four-role model (admin, creator, operator, user) enforces least-privilege access with clear separation between those who configure agents (creators), those who operate them (operators), and those who use them (users). Third, self-hosted architecture: because Trinity runs in your infrastructure, there is no SaaS platform boundary to compromise. An attacker who wants to access your agents' data must compromise your infrastructure directly — there is no vendor platform to target. Fourth, tamper-evident audit log: the hash-chained append-only audit log makes it cryptographically difficult to cover up a security incident after the fact, which both deters insider threats and supports post-incident forensics. For MSPs serving clients with security-sensitive requirements (financial services, healthcare, government, legal), Trinity's security credentials are materially stronger than any cloud-hosted alternative and are evidenced by third-party audit rather than vendor attestation alone.
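The separation of duties in the four-role model can be illustrated with a deny-by-default permission check. The role names come from the assessment above; the permission names and their assignment to roles are assumptions made for this sketch and may not match how Trinity actually maps them.

```python
from enum import Enum


class Role(Enum):
    ADMIN = "admin"
    CREATOR = "creator"
    OPERATOR = "operator"
    USER = "user"


# Assumed permission names and assignments, for illustration only.
PERMISSIONS = {
    Role.ADMIN: {"manage_roles", "configure_agent", "operate_agent", "invoke_agent"},
    Role.CREATOR: {"configure_agent", "invoke_agent"},
    Role.OPERATOR: {"operate_agent", "invoke_agent"},
    Role.USER: {"invoke_agent"},
}


def is_allowed(role: Role, permission: str) -> bool:
    """Deny by default: a permission must be listed explicitly for the role."""
    return permission in PERMISSIONS.get(role, set())


def require(role: Role, permission: str) -> None:
    """Raise PermissionError unless the role holds the permission."""
    if not is_allowed(role, permission):
        raise PermissionError(f"{role.value} may not {permission}")
```

Under these assumed mappings, a creator can configure an agent but not operate it, and an operator can operate but not reconfigure, which is the least-privilege separation the text describes.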


PSF Domain 8: Vendor Resilience

Strong

Trinity is open source and self-hosted. Agent state lives in your GitHub. Operations are not dependent on Ability.AI's uptime. You can fork, modify, and run the platform indefinitely without vendor involvement. This is the highest vendor resilience of any runtime we have assessed.

PSF Domain 8 addresses the risk that a vendor dependency becomes a single point of failure — that your AI agents stop working if a vendor changes pricing, discontinues a product, has an outage, or exits the market. Trinity eliminates this category of risk entirely through its architecture. Being open source, the codebase is available for inspection, forking, and modification under its licence. Being self-hosted, your running instance is not dependent on Ability.AI's infrastructure for operation — a network partition between your environment and Ability.AI's website has no effect on your deployed agents. Agent state being stored in your GitHub means that your agents' operational memory is under your version control, not in a vendor database. In a scenario where Ability.AI ceased to operate tomorrow, your deployed Trinity instances would continue running, your state would be intact, and you could maintain the platform from the open-source codebase indefinitely. For MSPs deploying AI infrastructure on behalf of clients, this resilience profile is commercially significant: you can credibly commit to multi-year managed service agreements without the caveat 'unless our vendor changes the product or pricing.'

Trinity for MSPs: the managed agent service model

Trinity's architecture maps directly to the MSP managed services model. You deploy and operate Trinity infrastructure on behalf of clients, within the client's own environment or a dedicated managed environment. The client never shares a multi-tenant cloud platform with other organisations. The client owns their data, their agent state, and their audit logs. You provide the operational expertise: deployment, configuration, certification, monitoring, and incident response.

This is a differentiated service that cloud-hosted AI platforms cannot replicate. When a client asks 'where does our data go?', the answer is 'nowhere — it stays in your environment.' When they ask 'what happens if the vendor changes pricing?', the answer is 'nothing — the platform runs independently of the vendor.' These are answers that only sovereign infrastructure makes possible.

The PSF certification path for Trinity MSP practitioners:

CAOP (Certified Agent Operator): Operates deployed Trinity instances, manages approval queues, monitors observability dashboards, handles escalations.
CAIG (Certified Agent Integration): Connects Trinity agents to client systems: CRMs, ERPs, ticketing systems, communication platforms.
CAIA (Certified AI Auditor): Audits Trinity deployments for compliance, reviews tamper-evident logs, produces audit reports for regulated clients.

When Trinity is the right choice

Trinity is the right runtime when any of the following requirements apply: regulated data that cannot leave the client's perimeter; compliance obligations requiring tamper-evident audit evidence; security posture that cannot accommodate a SaaS agent platform; multi-year service commitments that cannot tolerate vendor platform risk; or client size and workload that justify the operational overhead of self-hosted infrastructure.

It requires more initial setup than a hosted platform — you are deploying and operating infrastructure rather than clicking through a SaaS onboarding flow. For MSPs, this operational complexity is the service. The client is not paying for a SaaS subscription; they are paying for certified professionals who know how to deploy, configure, and operate sovereign AI infrastructure safely.

Trinity is less appropriate for exploratory deployments where quick iteration matters more than sovereignty, for small organisations without the IT infrastructure to run Docker-based systems, or for use cases with no regulated data and no compliance obligations — in those cases, a simpler hosted runtime may be adequate. The decision criteria should always start with the data and compliance requirements, not the technology preference.

Related assessments and guides

MSP Agent Deployment Playbook
Step-by-step guide to building a managed agent practice using sovereign infrastructure.
n8n PSF Assessment
Workflow automation alternative — useful alongside Trinity for non-AI orchestration.
OpenAI Agents SDK Assessment
Cloud-hosted comparison — useful for understanding the sovereignty trade-offs.
PSF Domain 8: Vendor Resilience
The full vendor resilience framework that makes sovereign infrastructure necessary.
Agent Framework Comparison
Side-by-side PSF assessment of all major frameworks, including Trinity.
Certified Agent Operator (CAOP)
The certification for professionals deploying and operating agent infrastructure in production.