Production AI Institute · PSF v1.1 open standard
AI Right-To-KnowAI Data Use IndexCheck My AI ToolsPolicy Change WatchAgent ReadinessPublic BenchmarkContactGlobal standard · Worldwide
Insights / PSF Assessmentcodex-sites-plugins · June 2, 2026

OpenAI Codex Sites & Role Plugins in Production: A PSF Domain Assessment

On June 2, 2026 OpenAI expanded Codex beyond coding with Sites (hosted workspace apps), Annotations for in-place refinement, and six role-specific plugins spanning 62 enterprise apps and 110 skills. The release accelerates knowledge-work agents; governance for hosted URLs and SaaS connectors remains deployment-layer work.

Production AI Institute · 11 min read · Updated June 2026
Independence disclosure:The Production AI Institute has no commercial relationship with OpenAI. This assessment is based on OpenAI's June 2, 2026 product announcement and related June 1, 2026 AWS partnership documentation. OpenAI was not consulted in preparing this assessment.

June 2, 2026 marks a platform shift for Codex: OpenAI now targets knowledge workers with Sites (preview interactive apps shared inside a workspace via URL), Annotations (section-level edits on documents, decks, spreadsheets, and Sites), and six role-specific plugins that bundle connectors to tools such as Salesforce, Snowflake, Figma, and HubSpot with roughly 110 automated skills. OpenAI reports about five million weekly Codex users, with non-developers at roughly 20 percent and growing three times faster than engineers.

This assessment is distinct from our Codex CLI 0.134 developer release (May 26, 2026) and our OpenAI on Amazon Bedrock GA (June 1, 2026). Those releases address permission profiles and AWS-hosted inference. The June 2 bundle addresses how enterprise teams orchestrate SaaS data and publish lightweight hosted applications without writing front-end code.

Release scope assessed

ArtifactScopeDate
Role-specific plugins6 bundles, 62 apps, 110 skills2026-06-02
SitesPreview, Business & Enterprise, OpenAI-hosted2026-06-02
AnnotationsSection-level refinement on knowledge artifacts2026-06-02
Admin controlsWorkspace app permissions; Sites enablement2026-06-02

PSF domain scorecard

Ratings reflect capabilities documented at the June 2, 2026 launch. Full domain definitions are in the Production Safety Framework.

DomainRating
D1Input GovernancePartial
D2Output ValidationPartial
D3Data ProtectionPartial
D4ObservabilityGap
D5Deployment SafetyPartial
D6Human OversightStrong
D7SecurityGap
D8Vendor ResiliencePartial
D1

Input Governance

Partial

Role plugins bundle 62 SaaS connectors and 110 skills, but Codex still ingests unstructured CRM, warehouse, and design-tool payloads unless admins scope app permissions and harnesses tag trusted versus untrusted content.

OpenAI's June 2, 2026 announcement positions six role-specific plugins (sales, data analytics, creative production, product design, public equity investing, and related bundles) that install without custom code. Each plugin pre-wires popular enterprise apps such as Salesforce, Snowflake, Figma, and HubSpot. Business and Enterprise admins can control underlying app permissions in workspace settings, which is a meaningful enterprise input gate compared with ad hoc API keys. Plugins do not classify ticket text, spreadsheet cells, or Slack threads as trusted by default. Teams enabling cross-department orchestration should combine admin allowlists with the same XML or block scoping patterns we document for API agents.

Practitioner action: Publish a plugin allowlist per business unit before rollout. Tag CRM and ticket payloads as untrusted in prompts. Review OAuth scopes when enabling Snowflake or Salesforce connectors for non-engineering roles.
D2

Output Validation

Partial

Annotations let users refine a section of a document, slide, spreadsheet, or Site without regenerating the whole artifact; Sites still need deployment-layer validators before financial or customer-facing numbers ship.

Annotations extend the in-place editing model OpenAI previewed for developers to knowledge-worker artifacts. That improves human-guided correction loops for PSF Domain 2 compared with full-regeneration chat. Sites can turn analysis into dashboards, planners, and lightweight tools shared inside a workspace via URL. OpenAI does not document schema validation or automated policy graders on generated Sites before publish. Sales decks, models, and customer summaries produced through plugins still require reviewer sign-off and golden-set checks, especially when plugins pull live pipeline data.

Practitioner action: Require human review on any Site or export that influences pricing, compliance, or customer commitments. Run spot checks on plugin-generated spreadsheets against source systems. Block auto-send to external recipients until a second reviewer approves.
D3

Data Protection

Partial

Workspace-hosted Sites and 62 third-party connectors increase the volume and sensitivity of data Codex can read; enterprise admin controls help but do not replace classification and retention policy.

OpenAI states Sites roll out in preview for Business and Enterprise teams through the Codex app, shareable with anyone in the workspace via URL, and are hosted by OpenAI. That concentrates intellectual property and operational metrics on OpenAI infrastructure unless contracts specify zero retention or regional routing (for example via the June 1, 2026 Amazon Bedrock path in our Bedrock assessment). Plugins aggregate data from CRM, analytics, and design systems in a single agent session, which aids productivity and expands blast radius if a session is over-permissioned. OpenAI reports non-developers are roughly 20 percent of five million weekly Codex users and adopting three times faster than engineers, which raises the need for role-based data policies beyond engineering norms.

Practitioner action: Map which plugins each role may install. Disable Sites preview for regulated data classes until legal review completes. Prefer Bedrock or contractually restricted API routes for PHI and financial records. Audit workspace Site URLs for accidental external sharing.
D4

Observability

Gap

Workspace admin settings govern app permissions and Sites enablement, but OpenAI does not publish SIEM-ready audit exports for plugin tool calls, Site views, or per-skill invocations comparable with AWS CloudTrail on Bedrock.

The June 2 release emphasizes no-code plugin installation and collaborative Sites, not centralized telemetry for security operations centers. Practitioners can infer usage from OpenAI workspace billing and admin consoles, yet PSF Domain 4 expects correlation IDs across CRM queries, warehouse pulls, and generated artifacts. Compare with our Codex CLI 0.134 assessment: developer-oriented tracing improved for local exec flows, but enterprise knowledge-worker sessions lack the invocation logging Bedrock documents for model calls. Teams rolling plugins to sales and finance need their own logging wrapper or DLP integration before promoting unattended schedules.

Practitioner action: Instrument approved plugins with your SIEM via vendor-native audit logs (Salesforce, Snowflake) where available. Log Codex session IDs in internal ticketing when agents act on customer records. Alert on new plugin installs at org level.
D5

Deployment Safety

Partial

Preview Sites and curated OpenAI plugins reduce reckless custom integrations, but production promotion still needs staged rollout, cost caps, and explicit disable switches for hosted apps.

Sites remain preview for Business and Enterprise, which signals OpenAI has not finalized SLA, abuse, or data-handling terms for customer-hosted interactive apps. Enterprise admins must enable Sites in admin settings, providing a coarse deployment gate. Role plugins ship as curated bundles rather than arbitrary MCP servers, which lowers the risk of unaudited community tools relative to open plugin marketplaces. The tradeoff is velocity: non-developers can publish workspace-visible apps without a traditional CI pipeline. PSF Domain 5 expects canary cohorts, rollback plans, and kill switches when a plugin misbehaves across 62 connected apps.

Practitioner action: Pilot one plugin per department with a written rollback procedure. Keep Sites disabled until security reviews URL sharing and retention. Cap daily token spend per workspace and per role. Document how to revoke a compromised OAuth connector without disabling all of Codex.
D6

Human Oversight

Strong

Annotations and section-level refinement are explicit human-in-the-loop primitives for knowledge work; irreversible plugin actions still need consequence-based approval rules.

OpenAI designed Annotations so users point Codex at a specific region of a document, deck, spreadsheet, or Site and request targeted edits. That is one of the stronger oversight affordances in the June 2 bundle for practitioners who review agent output before wide distribution. Plugins automate follow-ups, pipeline summaries, and design variations, which can compress cycle time but also reduce natural pause points. PSF Domain 6 still requires policy: any action that sends customer email, modifies production data, or commits spend needs human approval regardless of annotation UX.

Practitioner action: Train sales and analytics teams on annotation-first review before sharing Sites. Pair plugin rollouts with CAIS-style escalation matrices. Block plugins from sending external communications without named approvers.
D7

Security

Gap

Sixty-two pre-integrated SaaS apps multiply OAuth supply-chain and over-permission risk; Sites hosted on OpenAI expand the attack surface for workspace URL leakage and prompt injection via shared canvases.

OpenAI curates plugins today and plans partner-authored plugins later, which will further diversify trust boundaries. Each connector (Salesforce, Snowflake, Figma, HubSpot, and dozens more) is a new credential and data exfiltration path if a session is hijacked or poisoned via indirect injection in a shared Site. Workspace URL sharing improves collaboration but increases the impact of a mis-scoped link forwarded outside the intended audience. Security teams should treat Codex enterprise mode as a privileged integration hub, not a chat UI. Align reviews with our OpenAI Agents SDK assessment on tool blast radius.

Practitioner action: Run least-privilege OAuth for every plugin app. Red-team Sites with untrusted viewer comments. Monitor for new plugin directory entries weekly. Require MFA on Business and Enterprise workspaces.
D8

Vendor Resilience

Partial

Plugins orchestrate existing SaaS investments rather than replace them, but OpenAI-hosted Sites and the Codex app channel create dependence on OpenAI availability and policy changes.

The June 2 narrative positions Codex as an orchestration layer above Salesforce, Snowflake, and design tools, which preserves exit paths to those vendors if Codex is unavailable. Sites, however, are hosted by OpenAI and distributed via workspace URLs, so business logic and UI state may not port cleanly to self-hosted stacks. OpenAI also ships the same capabilities through CLI and desktop clients per third-party reporting, which helps continuity for technical teams. Practitioners should export critical Site definitions and maintain parallel workflows in native SaaS UIs for outage scenarios.

Practitioner action: Document fallback procedures when Codex is down (native Salesforce reports, etc.). Avoid Sites as the sole system of record for compliance artifacts. Revisit Bedrock routing for model inference resilience while accepting Sites remain OpenAI-hosted.

Certification and stack context

Teams enabling Codex plugins for sales, finance, or analytics should align connector governance with AIDA (AI Deployment Associate) deployment checklists. Hosted Sites and cross-app orchestration benefit from CLOE (Certified LLM Operations Engineer) practices on logging, cost caps, and environment separation. Plugin blast radius and oversight design map directly to CAIS (Certified AI Safety Specialist) tool-access guidance. Compare harness-level controls in our OpenAI Agents SDK assessment when mixing API agents with Codex desktop workflows.

Sources

Scores are structured assessments against PSF v1.1, not empirical PAI Lab multi-run results. Revisit when Sites exit preview or when OpenAI publishes audit-log specifications for plugin invocations.

Use this assessment against your own deployment. The free AIDA exam checks PSF readiness in about 20 minutes.

Verify your deployment — free AIDA exam →
Apply the standard

Turn the evidence into production practice.

Use the PSF, research library, and Lab material to review your own deployment. Credentials are available when a client, employer, or regulator needs public proof.

The Production AI Brief