benchmark event

GPT-4.1 — Q2 2026 Lab benchmark

Name: GPT-4.1 — Q2 2026 Lab benchmark
Start: 2026-04-30T14:00:00.000Z

GPT-4.1 scored 74/100 overall in the Q2 2026 PAI Lab PSF reliability index. It was strong on structured-output adherence, and its escalation-trigger reliability was above average. The notable gap was PII handling in summarisation tasks (PSF-03).

Confidence: 82%

Detected: 30 Apr 2026

Event summary

D1, D2, D3, D4, D6, D5, D7, D8

Linked entities

GPT-4.1 | model | 82%
OpenAI | vendor | 70%

Related graph edges

Edge	Type	Confidence
ent-vendor-openai to ent-psf-d3	maps to	62%
ent-vendor-openai to ent-psf-d4	maps to	62%
ent-vendor-openai to ent-psf-d5	maps to	62%
ent-vendor-openai to ent-psf-d8	maps to	62%
ent-lab-model-gpt-4-1 to ent-psf-d1	maps to	68%
ent-lab-model-gpt-4-1 to ent-psf-d2	maps to	68%
ent-lab-model-gpt-4-1 to ent-psf-d3	maps to	68%
ent-lab-model-gpt-4-1 to ent-psf-d4	maps to	68%
ent-lab-model-gpt-4-1 to ent-psf-d5	maps to	68%
ent-lab-model-gpt-4-1 to ent-psf-d6	maps to	68%
ent-lab-model-gpt-4-1 to ent-psf-d7	maps to	68%
ent-lab-model-gpt-4-1 to ent-psf-d8	maps to	68%