What is scored
The overall readiness score is a 0-100 composite across the eight Production Safety Framework domains, each weighted equally. In the first release, self-assessment answers are combined with repository evidence signals whenever the user provides a public GitHub URL or a file manifest for a private repository.
Input boundary
Scope, allowed sources, abuse controls, and prompt injection boundaries.
Output validation
Contracts, schemas, refusals, confidence thresholds, and failure paths.
Data stewardship
Classification, minimisation, retention, redaction, and vendor data access.
Observability
Traces, evals, incidents, drift, operational review, and production metrics.
Deployment control
Versioning, release gates, canaries, rollbacks, and reproducibility.
Human oversight
Autonomy limits, approvals, escalations, overrides, and audit trails.
Security posture
Tool permissions, secrets, agent threat testing, and integration risk.
Ecosystem resilience
Provider fallbacks, dependency inventory, portability, and degraded modes.
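As a rough illustration of the equal weighting described above, the composite can be sketched as a plain average of the eight domain scores. This is a minimal sketch, not the product's actual implementation: the function name `readiness_score`, the assumption that each domain is already scored on a 0-100 scale, and the choice to reject incomplete input are all hypothetical. How self-assessment answers and repository evidence are merged into a single domain score is not specified here, so the sketch starts from per-domain scores.

```python
from statistics import fmean
from typing import Mapping

# The eight domains listed above.
DOMAINS = (
    "Input boundary",
    "Output validation",
    "Data stewardship",
    "Observability",
    "Deployment control",
    "Human oversight",
    "Security posture",
    "Ecosystem resilience",
)


def readiness_score(domain_scores: Mapping[str, float]) -> float:
    """Return the equal-weight composite of the eight domain scores.

    Assumption (not from the source): every domain score is already
    expressed on a 0-100 scale, so the mean is also on a 0-100 scale.
    """
    missing = [name for name in DOMAINS if name not in domain_scores]
    if missing:
        raise ValueError(f"missing domain scores: {missing}")

    values = []
    for name in DOMAINS:
        score = domain_scores[name]
        if not 0 <= score <= 100:
            raise ValueError(f"{name} score {score} is outside 0-100")
        values.append(score)

    # Equal weights reduce the weighted sum to a simple mean.
    return fmean(values)


if __name__ == "__main__":
    scores = {name: 100.0 for name in DOMAINS}
    scores["Security posture"] = 20.0
    print(readiness_score(scores))  # 90.0
```

Under these assumptions, one weak domain moves the composite by only one eighth of its shortfall: a domain at 20 with the rest at 100 still yields 90, which is worth keeping in mind when reading a high overall score.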