Production AI Institute · PSF v1.1 open standard
AI Right-To-KnowAI Data Use IndexCheck My AI ToolsPolicy Change WatchAgent ReadinessPublic BenchmarkContactGlobal standard · Worldwide
Insights / PSF AssessmentCodex CLI 0.134.0 · May 26, 2026

OpenAI Codex CLI 0.134.0 in Production: A PSF Domain Assessment

OpenAI shipped Codex CLI 0.134.0 on May 26, 2026 with profile-first permissions, richer MCP controls, and searchable local history. The release strengthens how teams govern coding agents in the terminal; eval harnesses and semantic validation remain yours to build.

Production AI Institute · 11 min read · Updated May 2026
Independence disclosure: The Production AI Institute has no commercial relationship with OpenAI. This assessment is based on the May 26, 2026 Codex CLI changelog, the rust-v0.134.0 GitHub release, and OpenAI engineering posts through May 27, 2026. OpenAI was not consulted in preparing this evaluation.

Codex CLI 0.134.0 (published May 26, 2026) is a versioned agentic coding release with production-relevant controls: --profile as the primary permission selector, searchable local conversation history, per-server MCP environment routing, OAuth options for streamable HTTP MCP servers, and stricter rejection of legacy profile configurations. The prior week shipped 0.133.0 with goals on by default, managed requirements.toml permission profiles, and expanded plugin lifecycle hooks.

This assessment is separate from our OpenAI Agents SDK evaluation, which covers hosted orchestration APIs. Codex targets developers and platform teams running terminal, IDE, and remote-control agents against real repositories. For organisations standardising on OpenAI for production engineering, the PSF question is whether 0.134.0 closes governance gaps or only surfaces more local sensitive data.

Release scope assessed

CapabilityVersionDate
Profile-first CLI/TUI/sandbox; legacy profile rejectionCodex CLI 0.134.02026-05-26
Local conversation history searchCodex CLI 0.134.02026-05-26
Goals default; managed permission profilesCodex CLI 0.133.02026-05-21
exec resume with --output-schemaCodex CLI 0.132.02026-05-20

PSF domain scorecard

Ratings reflect Codex CLI 0.134.0 and dependent May 2026 releases documented in OpenAI primary sources. Domain definitions: Production Safety Framework.

DomainRating
D1Input GovernancePartial
D2Output ValidationPartial
D3Data ProtectionPartial
D4ObservabilityPartial
D5Deployment SafetyStrong
D6Human OversightPartial
D7SecurityStrong
D8TestingGap
D1

Input Governance

Partial

CLI 0.134.0 makes --profile the primary permission selector and rejects legacy profile configs, but untrusted repo content and MCP payloads still need explicit scoping before execution.

The May 26, 2026 release promotes --profile across CLI, TUI permissions, and sandbox flows, with migration guidance when legacy profile configs are detected. Managed requirements.toml permission profiles (shipped in 0.133.0) let enterprises define inherited deny-read globs and approval requirements. Connector tool schemas preserve local $ref structures and compact oversized schemas before exposure, which reduces accidental over-broad tool surfaces. None of this classifies inbound prompts or repository files as trusted versus untrusted by default: AGENTS.md and skills still load from the workspace unless profiles block paths.

Practitioner action: Pin a production profile in CI and local developer machines. Wrap retrieved or ticket-sourced text in explicit untrusted blocks in prompts. Deny-read sensitive paths in requirements.toml before enabling MCP servers that can read the repo.
D2

Output Validation

Partial

Structured output-schema on exec resume and schema-stable MCP tools help contract-bound automations; semantic validation of agent actions remains external.

Codex 0.132.0 added --output-schema support on exec resume so long-running sessions can enforce JSON contracts on resumed automations. Version 0.134.0 improves connector schema fidelity for MCP tools. OpenAI's May 27, 2026 Tax AI case study describes eval-backed validation gates before production promotion, but those harnesses are application patterns, not defaults in the CLI. For PSF Domain 2, format compliance is achievable; content safety and business-rule validation still require practitioner-defined graders.

Practitioner action: Define output schemas for every codex exec automation that mutates billing, customer, or production config. Add a secondary validator (policy script or smaller model pass) before merging pull requests opened by Codex.
D3

Data Protection

Partial

Local conversation history search keeps more context on disk; enterprise gates and workspace usage-limit messaging improve policy visibility but do not replace data-classification controls.

Version 0.134.0 introduces searchable local conversation history with case-insensitive previews, which aids incident review but increases the sensitivity of disk artifacts on developer machines. Enterprise requirement gates and workspace-specific usage-limit copy help operators explain credit and spend-cap failures without exposing raw prompts in logs. Cloud and ChatGPT-authenticated flows still process prompts on OpenAI infrastructure unless you deploy air-gapped or contractually restricted configurations. Practitioners in regulated sectors should map which turns are local-only versus cloud-backed before enabling remote-control or shared threads.

Practitioner action: Encrypt developer laptops that run Codex with local history enabled. Use enterprise auth and document retention for cloud threads. Redact customer identifiers in prompts even when using API keys with no-training terms.
D4

Observability

Partial

Improved websocket tracing, turn-start analytics, and history search strengthen operator visibility; SIEM-ready export still requires your pipeline.

The 0.134.0 changelog cites tracing and analytics for websocket requests, turn starts, and remote compaction v2. Goals (default since 0.133.0) expose progress across turns, which helps humans see multi-step agent state. Production teams still need correlation IDs into CI systems, cost dashboards tied to profile and model version, and retention aligned to compliance schedules. Compare with our OpenAI Agents SDK assessment: Codex is stronger for interactive developer observability than for unattended fleet telemetry.

Practitioner action: Log model version, profile name, and thread ID on every codex exec invocation in CI. Ship traces to OpenTelemetry or your existing APM. Review local history search access on shared build agents.
D5

Deployment Safety

Strong

Sandbox profiles, codex doctor diagnostics, managed network proxy for Node tools, and explicit profile migration reduce unsafe default deployments relative to earlier CLI generations.

Codex ships sandbox execution with profile-aware permissions on macOS, Linux, and Windows (including VT fixes in 0.134.0 for Windows TUI). The codex doctor command surfaces runtime, auth, terminal, network, and config health. Managed requirements.toml profiles let security teams publish one policy artifact developers must consume. Remote-control reconnect and compaction retries improve reliability of long-running operational agents without silently widening permissions. Teams should still stage profile changes and test sandbox denials before wide rollout.

Practitioner action: Run codex doctor in CI bootstrap. Version-control requirements.toml per repo. Block --profile overrides in production pipelines except from a signed config bundle.
D6

Human Oversight

Partial

Goals, plan-mode question flows, and approval modes support human checkpoints; autonomous goal continuation can still burn tokens unless usage limits are configured.

Goals became default in 0.133.0 with dedicated storage and progress tracking. Plan-mode fixes in recent releases prevent accidental submission on modified Enter keys. OpenAI documents pausing goal continuation on usage limits and repeated blockers. For high-consequence repos, combine approval modes with manual PR review rather than trusting goal completion alone. The May 27 Tax AI post emphasizes practitioner corrections as structured training signals: that is an oversight pattern, not an automatic gate.

Practitioner action: Require human approval before codex merges to main. Cap goal extension turns in production repos. Map irreversible tool calls to always-ask approval in the active profile.
D7

Security

Strong

Permission profiles with inheritance, read-only concurrent MCP tools, per-server MCP environments, and OAuth options for streamable HTTP servers materially improve supply-chain containment for agent tooling.

Version 0.134.0 routes MCP servers through explicit environments, supports OAuth on streamable HTTP MCP servers, and allows parallel execution only when tools advertise readOnlyHint. MITM hook configuration and runtime enforcement landed in the 0.131.x series. Windows sandbox integration tightened deny-read and write-root resolution. These controls align with PSF Domain 7 expectations for least-privilege tool access better than most coding agents in our comparison set, provided teams actually publish restrictive profiles instead of running default-allow locally.

Practitioner action: Audit every MCP server URL and OAuth scope quarterly. Disable write-capable MCP tools in production profiles unless tied to a named automation. Align reviews with CAIS tool-access guidance.
D8

Testing

Gap

Codex documents eval and harness patterns in engineering posts but does not ship repository-integrated regression suites or policy snapshots in the CLI itself.

OpenAI's self-improving agent narrative depends on targeted eval YAML, regression suites, and bounded Codex task environments in customer repos. Those are conventions practitioners must build. The CLI does not fail CI when behaviour drifts after a model or profile upgrade. PSF Domain 8 maturity requires golden-set runs on every release channel bump (0.134.0 today, 0.133.0 last week).

Practitioner action: Maintain a fixture repo with frozen prompts and expected artifacts. Run codex exec against it on every CLI upgrade. Block promotion when diff exceeds tolerance.

Certification and stack context

Teams deploying Codex in CI should align runbooks with AIDA (AI Deployment Associate) checklists before granting write-capable profiles on production branches. Long-running Codex automations benefit from CLOE (Certified LLM Operations Engineer) practices for model pinning, cost telemetry, and incident response. MCP and plugin breadth should be reviewed against CAIS (Certified AI Safety Specialist) tool-access guidance. Compare terminal agents in our agent framework comparison and the contemporaneous Cursor 3.5 Automations assessment when mixing vendor coding agents.

Sources

Scores are structured assessments against PSF v1.1, not empirical lab results. Revisit when OpenAI ships enterprise-wide policy enforcement for Codex remote-control fleets or changes default goal continuation behaviour.

Apply the standard

Turn the evidence into production practice.

Use the PSF, research library, and Lab material to review your own deployment. Credentials are available when a client, employer, or regulator needs public proof.

The Production AI Brief