See full changelog: https://github.com/openai/openai-dotnet/blob/OpenAI_2.11.0/CHANGELOG.md
OpenAI
Workspace plugin sharing now available in Codex
This release: 1 feature (new capabilities). AI-tallied from the release notes.
Plugin sharing is now available by default for eligible ChatGPT Enterprise workspaces in Codex. Users can share local plugins with their workspace so teammates can install and use shared plugins from the Codex plugin directory. Workspace admins can disable plugin sharing using MDM or cloud-managed configuration.
Activity insights and shareable profile cards added
This release: 2 features (new capabilities), 2 enhancements (improvements to existing features), 5 fixes (bug fixes). AI-tallied from the release notes.
# Codex app updates
### New features
- Added activity insights and share cards to the [Profile section](/codex/app/settings#profile). You can review Codex usage highlights and save a profile card; sharing is available on consumer ChatGPT plans.
### Performance improvements and bug fixes
- Improved Computer Use startup readiness and appshot error reporting.
- Fixed browser and review UI issues, including fullscreen browser composer controls, hex color swatches, terminal scrollbar alignment, and animated diff stat alignment.
- Expanded onboarding with more role choices so Codex can tailor first-run suggestions more accurately.
- Additional performance improvements and bug fixes.
Lockdown Mode limits web access to reduce prompt injection risk
This release: 1 feature (new capabilities). AI-tallied from the release notes.
Lockdown Mode is now available to all logged-in users across account types and workspaces as an opt-in advanced security setting. It limits access to the web and external services to reduce the risk of data exfiltration from prompt injection attacks. When enabled, ChatGPT restricts network-enabled capabilities including live web browsing, deep research, agent mode, file downloads, and some web-derived image support. Personal users can enable it via Settings > Security. Workspace admins can configure access through workspace settings and role-based access controls.
Memory now updates automatically; Plus/Pro get 2x capacity
This release: 2 features (new capabilities), 1 enhancement (improvements to existing features). AI-tallied from the release notes.
Upgraded memory so ChatGPT can better keep context up to date and reduce stale or contradictory saved memories. Memories are now updated automatically, with ChatGPT tracking important details to build on shared context. Users can review memories through sources or the memory summary. For Plus and Pro users, ChatGPT can now remember more useful context with twice as much memory capacity. Rolling out to Plus and Pro users in the US today, with expansion to Free and Go plans and additional countries over the next few weeks. The legacy saved memories system is available via Settings > Memory > Saved memories.
Ads rolling out to Free, Go plans in UK; paid plans remain ad-free
This release: 1 enhancement (improvements to existing features). AI-tallied from the release notes.
Ads are beginning to roll out for users on Free and Go plans in the UK. Plus, Pro, Business, Enterprise, and Education plans will remain ad-free.
Enterprise credit limits; remote-control pairing; multi-agent runtime choice
This release: 6 features (new capabilities), 6 fixes (bug fixes). AI-tallied from the release notes.
New Features
- TUI controls now support F13-F24 keybindings, paste in searchable menus, and a compact reasoning-only status/title item (#25329, #25400, #25504).
- Enterprise/admin flows now show monthly credit limits and can apply cloud-managed config bundles, including EDU workspaces (#24812, #24617, #24619, #24620, #24622, #25963).
- Remote-control clients can start pairing and list or revoke controller grants through app-server v2 RPCs (#25675, #25785).
- Plugin workflows gained machine-readable `codex plugin list --json` output and cached remote catalog suggestions (#25330, #25457).
- Hosted web and image tools are available in more code-mode flows, with standalone web searches able to run in parallel (#25176, #25702, #25890, #25923).
- Multi-agent v2 keeps runtime choice with each thread and exposes cleaner follow-up and metadata defaults for spawned agents (#25266, #25636, #25720, #25721, #25722, #25841, #26114).
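
The machine-readable plugin listing mentioned above lends itself to scripting. The following is a minimal sketch, not the documented interface: the release notes do not describe the output schema, so the code assumes only that `codex plugin list --json` prints valid JSON to stdout. The helper names (`load_plugin_listing`, `count_entries`, `run_plugin_list`) and the sample payload are hypothetical.

```python
import json
import subprocess


def load_plugin_listing(raw: str):
    """Parse the raw text emitted by `codex plugin list --json`.

    Assumption: the output is valid JSON. Nothing is assumed about its shape.
    """
    return json.loads(raw)


def count_entries(listing) -> int:
    """Count top-level entries whether the payload is a list or an object."""
    if isinstance(listing, (list, dict)):
        return len(listing)
    return 1  # a bare scalar counts as a single entry


def run_plugin_list() -> str:
    """Invoke the CLI and return its stdout (requires `codex` on PATH)."""
    result = subprocess.run(
        ["codex", "plugin", "list", "--json"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical sample payload for illustration only; the real shape may differ.
sample = '[{"name": "example-plugin"}, {"name": "another-plugin"}]'
print(count_entries(load_plugin_listing(sample)))
```

With a local Codex install, `load_plugin_listing(run_plugin_list())` would replace the sample string. Inspect the real payload before relying on any field names.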
Bug Fixes
- Cancelling a submitted prompt before visible output now restores the draft, attachments, and collaboration mode for editing (#25316).
- Slash-command filtering and footer shortcut hints now reset or render according to the current UI state (#25492, #25625).
- Platform reliability improved for macOS app launches and Windows SQLite startup, thread resume, and sandbox setup refreshes (#25485, #25490, #25509, #25949).
- Plugin loading preserves app manifest order, deduplicates local/remote curated installs, and treats malformed `skills` fields as warnings (#25491, #25681, #25717, #25782).
- Permission requests and approvals now carry environment identity, and managed MITM proxying exports readable CA bundles to child commands (#25850, #25858, #25862, #22668).
- Local session history is safer for compressed rollouts, renamed titles, pathless side-chat reloads, and stack-heavy startup/config rebuilds (#25087, #25624, #25661, #25814, #25844, #25847).
Documentation
- Added app-server docs and generated schema updates for monthly credit limits, remote-control RPCs, and environment-scoped permission approvals (#24812, #25675, #25785, #25862).
- Moved repo review rules and contributor conventions into `AGENTS.md`, including Rust test-module layout and Python 3 compatibility guidance (#25682, #25690, #25738).
Chores
- Root formatting and Justfile workflows are more complete and Windows-aware (#24983, #25165, #25683).
- Rust CI and release workflows use the git CLI for Cargo fetches to avoid intermittent libgit2/submodule failures (#25644, #25775).
- Python SDK releases now publish runtime wheels from the SDK workflow and pin to a glibc-compatible runtime package (#25906, #25907).
- Bazel CI’s BuildBuddy wrapper was reintroduced with Windows-safe process handling and validation (#25915).
- Shared prompts, context fragments, and skills plumbing moved into dedicated crates/extension paths to reduce `codex-core` coupling (#25151, #25953, #25959, #26106, #26122, #26167).
Changelog
Full Changelog: rust-v0.136.0...rust-v0.137.0
- #25329 feat(tui): allow function keys through f24 in keymaps @fcoury-oai
- #24617 Add config bundle transport types @joeflorencio-openai
- #25435 Add build_unsigned_archive release mode @shijie-oai
- #24619 Compose requirements layers @joeflorencio-openai
- #24620 Add cloud-managed config layer support @joeflorencio-openai
- #25462 Revert "Add build_unsigned_archive release mode" @shijie-oai
- #25113 store and expose parent_thread_id on Threads @owenlin0
- #25266 Set multi-agent v2 dogfood defaults @jif-oai
- #25060 Add goal extension idle continuation @jif-oai
- #25576 Use templates for goal steering prompts @jif-oai
- #25577 Remove Plan-mode gate from idle turn injection @jif-oai
- #25096 Add goal extension GoalApi @jif-oai
- #25087 Read compressed rollouts and materialize before append @jif-oai
- #25628 [codex] fix compressed rollout fixture SessionMeta initialization @fcoury-oai
- #25316 feat(tui): restore output-free cancelled prompts @fcoury-oai
- #23763 Preserve auto-review approval policy in codex exec @won-openai
- #25400 Allow paste in searchable selection menus @charliemarsh-oai
- #25485 Use deep links for macOS codex app paths @etraut-openai
- #25492 Reset slash popup selection when filter changes @etraut-openai
- #25504 Add reasoning-only status surface item @etraut-openai
- #25624 Preserve renamed thread titles during reconciliation @jif-oai
- #25089 Compress cold local rollouts @jif-oai
- #25490 Disable SQLite intrinsics for Windows x64 releases @etraut-openai
- #25603 [codex] Inherit raw events for spawned child listeners @vivi
- #25644 [codex] Use git CLI for release Cargo fetches @shijie-oai
- #25655 nit: drop todo @jif-oai
- #25654 Parallelize cold rollout compression @jif-oai
- #25121 exec-server: add environment path refs @starr-openai
- #25636 [codex] Rename multi-agent v2 assign_task to followup_task @jif-oai
- #25491 Preserve plugin app manifest order @charlesgong-openai
- #24983 [codex] Make justfile recipes Windows-aware @iceweasel-oai
- #25151 [codex] Consolidate shared prompts in codex-prompts @anp-oai
- #25659 Throttle repeated rollout compression runs @jif-oai
- #25165 Check root Python script formatting in CI @anp-oai
- #23767 [codex-rs] auto-review model override @won-openai
- #25149 exec-server: canonicalize bound filesystem paths @starr-openai
- #25669 fix: deflake zsh-fork approval test @jif-oai
- #24979 feat: gate unified exec zsh fork composition @bolinfest
- #24980 refactor: hide shell override for zsh fork unified exec @bolinfest
- #25679 Add rollout compression counters @jif-oai
- #25682 [codex] document out-of-line test module convention @anp-oai
- #25680 Add rollout compression histograms @jif-oai
- #25689 [codex] Generalize deferred nested tool guidance @sayan-oai
- #25690 Add Python version compatibility guidance @anp-oai
- #25681 fix: Deduplicate installed local and remote curated plugins @xl-openai
- #25701 fix: rename McpServer to TestAppServer @bolinfest
- #25702 [codex] enable parallel standalone web search calls @sayan-oai
- #25705 Fix stale TestAppServer rename in plugin_list test @bolinfest
- #25684 Move tool search metadata onto ToolExecutor @jif-oai
- #25625 fix(tui): clarify footer shortcut overlay hints @fcoury-oai
- #25649 [codex] Publish release symbol artifacts @nornagon-openai
- #25661 Reject directory rollout paths for pathless side chats @bolinfest
- #22668 Wire managed MITM CA trust into child env @winston-openai
- #25712 app-server: remove experimental persist_extended_history bool flag @owenlin0
- #24621 Move cloud requirements crate to cloud config @joeflorencio-openai
- #25717 Handle invalid plugin skills manifest field @xli-oai
- #25675 feat(remote-control): add pairing start @apanasenko-oai
- #25683 [codex] Add comprehensive root formatting check @anp-oai
- #25738 Move code review rules into AGENTS @pakrym-oai
- #24812 feat: show enterprise monthly credit limits in status @efrazer-oai
- #25330 [codex] Add plugin list JSON output @xl-openai
- #25457 [codex] Cache remote plugin catalog for suggestions @xl-openai
- #25783 [codex] Move plugin discoverable logic into core-plugins @xl-openai
- #25782 [codex] Validate plugin skill base names @xl-openai
- #25814 feat: reuse compressed rollout search snippets @jif-oai
- #25720 Add multi-agent runtime metadata types @jif-oai
- #25721 Persist multi-agent runtime metadata @jif-oai
- #25722 Resolve per-thread multi-agent runtime @jif-oai
- #25841 session: keep startup prewarm aligned with resolved multi-agent runtime @jif-oai
- #25840 fix: main oops @jif-oai
- #25723 Test remote multi-agent runtime selector override @jif-oai
- #25724 Test runtime selector before first turn @jif-oai
- #25844 Reduce stack pressure in session startup and config rebuilds @jif-oai
- #25857 flake: Keep plugin test homes alive @jif-oai
- #25847 Run Codex async main on a sized stack @jif-oai
- #25775 [codex] Use git CLI for Cargo fetches across Rust workflows @anp-oai
- #25167 [app-server][core] Add connector-level Guardian reviewer overrides @zamoshchin-openai
- #25868 Skip startup prewarm when websockets are disabled @jif-oai
- #25156 Route Bazel CI through shared BuildBuddy remote config wrapper @anp-oai
- #25739 core: derive built-in permission profiles from raw policies @bolinfest
- #25909 [codex] Revert shared BuildBuddy Bazel wrapper @anp-oai
- #25850 Key request-permission grants by environment @jif-oai
- #25707 [codex-analytics] Track CodexErr details in turn analytics @rhan-oai
- #25858 Add environmentId to request_permissions @jif-oai
- #25176 Route standalone image generation through host finalization md @won-openai
- #25916 Fix Windows release PDB staging @shijie-oai
- #25862 Propagate permission approval environment id @jif-oai
- #25907 [codex] Pin Python SDK to glibc-compatible runtime @aibrahim-oai
- #24859 Use environment secrets for Azure signing @shijie-oai
- #25509 Fix Windows running thread resume path normalization @etraut-openai
- #25135 Populate workspace kind on Codex turn events @knittel-openai
- #24622 Switch runtime to cloud config bundle @joeflorencio-openai
- #25938 fix: update image generation test helper rename @joeflorencio-openai
- #25911 core: stop passing legacy SandboxPolicy to guardian reviews @bolinfest
- #25668 Split cloud config bundle service modules @joeflorencio-openai
- #25890 [codex] Keep hosted tools visible in code-only mode @aibrahim-oai
- #25867 Add remote request permissions integration coverage @jif-oai
- #25943 config: remove dead profile sandbox fallback @bolinfest
- #25948 Revert "Use environment secrets for Azure signing" @shijie-oai
- #25923 Expose standalone image generation in code mode @won-openai
- #25906 [codex] Publish Python runtime wheels with Python SDK releases @aibrahim-oai
- #25953 feat: add skills extension scaffold @jif-oai
- #25915 [codex] Fix Windows BuildBuddy Bazel wrapper execution @anp-oai
- #25926 config: express implicit sandbox defaults as permission profiles @bolinfest
- #25959 feat: add extension turn-input contributors @jif-oai
- #25963 Allow EDU accounts to fetch cloud config bundles @joeflorencio-openai
- #25785 feat(app-server): add remote control client management RPCs @apanasenko-oai
- #25988 revert: publish release symbol artifacts @shijie-oai
- #26114 feat: default hide_spawn_agent_metadata to true @jif-oai
- #26122 chore: extract context fragments into dedicated crate @jif-oai
- #26144 Reject MAv2 close_agent self-targets @jif-oai
- #26106 skills: resolve per-turn catalogs from turn input context @jif-oai
- #26155 fix: serialize goal progress accounting @jif-oai
- #26156 chore: mechanical rename @jif-oai
- #26167 Implement v1 skills extension prompt injection @jif-oai
- #26176 fix: main @jif-oai
- #25949 [codex] Restore setup helper UAC manifest @iceweasel-oai
Moderation support added to responses and chat completions
This release: 1 feature (new capabilities). AI-tallied from the release notes.
2.41.0 (2026-06-03)
Full Changelog: v2.40.0...v2.41.0
Features
- api: responses.moderation and chat_completions.moderation (87e46c2)
Moderation support added to responses and chat completions
This release: 1 feature (new capabilities). AI-tallied from the release notes.
6.42.0 (2026-06-03)
Full Changelog: v6.41.0...v6.42.0
Features
- api: responses.moderation and chat_completions.moderation (6d8f592)
Moderation support for responses and chat completions
This release: 1 feature (new capabilities). AI-tallied from the release notes.
3.39.0 (2026-06-03)
Full Changelog: v3.38.0...v3.39.0
Features
- api: responses.moderation and chat_completions.moderation (7a2dac0)
GPT-Rosalind gains agentic coding; LifeSciBench benchmark debuts
This release: 2 features (new capabilities). AI-tallied from the release notes.
Introducing new capabilities to GPT‑Rosalind
Bringing greater intelligence grounded in real scientific workflows for the life sciences industry.
We're introducing a new model update to our GPT‑Rosalind series purpose-built for life sciences research at enterprise scale. It combines GPT‑5.5's agentic coding and tool-use capabilities with stronger model intelligence in core drug-discovery domains such as medicinal chemistry and genomics, while advancing performance across broader life sciences analysis, design, and experimental workflows.
Progress in life sciences depends on synthesizing data and evidence across scales and modalities: molecules, genes, pathways, and living systems. In our evaluations, the updated GPT‑Rosalind shows broad performance gains on research tasks from biology experts, complex medicinal chemistry queries, quantitative biology, and wet lab troubleshooting.
GPT‑Rosalind is now available in research preview to eligible organizations globally through our trusted-access deployment structure.
Improving performance on scientifically valuable tasks
To measure and continuously improve the real-world impact of GPT‑Rosalind, we designed LifeSciBench, an externally expert-judged benchmark focused on foundational aspects of life sciences research. Unlike existing benchmarks that evaluate a single component of model performance or a single biological domain in isolation, LifeSciBench takes an end-to-end view of scientifically valuable work by drawing tasks from six workflow areas central to life sciences research: evidence handling, analysis, design and optimization, scientific reasoning, validation and operations, and translation and communication. We use this benchmark to align progress with the needs and realities of life sciences research.
LifeSciBench Overall Scores
[Figure: overall LifeSciBench score (%), 0% to 80% axis, comparing GPT-Rosalind, GPT-5.5, Grok 4.3, and Gemini 3.1 Pro.]
LifeSciBench Scores by Scientific Workflow
[Figure: LifeSciBench score (%), 0% to 80% axis, comparing GPT-Rosalind and GPT-5.5 across six workflows: Evidence Handling; Analysis; Design, Optimization, & Prediction; Reasoning; Validation & Operations; Translation & Scientific Communication.]
GPT‑Rosalind leads performance across scientifically valuable tasks identified by industry and academic experts.
Evidence Handling
Extracting, reconciling, and auditing scientific evidence from papers, figures, tables, and experimental records.
Eval Example
We're preparing for a Type B FDA meeting on AAV9-microDys-X, an AAV9-based micro-dystrophin gene therapy for Duchenne muscular dystrophy that expresses a 138 kDa construct from an MCK promoter, and we want a hard-nosed critique of whether our current package really supports accelerated approval on micro-dystrophin expression as a surrogate endpoint reasonably likely to predict clinical benefit.
Study context: open-label Phase 1b/2 in 12 ambulatory boys age 4–7 with confirmed DMD and out-of-frame rod-domain deletions. The package is:
- Pre-treatment vastus lateralis biopsies: 0–3% of healthy-control dystrophin by quantitative Western blot using MANEX1A against the N-terminal actin-binding domain.
- 12-week post-treatment contralateral vastus lateralis biopsies: mean micro-dystrophin 38% of healthy control (range 18–61%) by the same Western blot, normalized to total protein by Coomassie staining.
- Post-treatment immunofluorescence: sarcolemmal signal in 75–95% of fibers using a polyclonal anti-dystrophin C-terminal antibody.
- 48-week function: mean NSAA change +1.4 points from baseline versus −0.6 in an external published natural-history registry cohort (p = 0.03 by unpaired t-test).
- Safety: transient transaminitis in 8/12 patients managed with steroid taper; one resolved myocarditis; no deaths.
- Biodistribution/persistence: AAV9 vector genomes detectable in muscle at 12 weeks at a mean of 2.3 vector genomes per nucleus.
- Eligibility: no baseline anti-AAV9 neutralizing antibodies (titer <1:400) and no exon-44 deletions.
Please pressure-test this package item by item: where would FDA or a skeptical reviewer say the evidence, as presented, fails to support our conclusion, and what additional data, analyses, or design changes would be needed to close those gaps if possible? For anything not specified, assume what is known from other clinical-grade 138 kDa micro-dystrophin programs and relevant clinical trials through December 31, 2025 also applies.
Candidate Response
Bottom line: As presented, this package is not strong enough to support accelerated approval on micro-dystrophin expression as a surrogate endpoint reasonably likely to predict clinical benefit. A skeptical FDA reviewer would likely view the biomarker, functional, durability, safety, and generalizability evidence as materially under-supported.
| Package Item | Key Failure Mode | What Would Be Needed |
|---|---|---|
| Western blot quantification | MANEX1A binds an N-terminal epitope shared by endogenous full-length dystrophin and the micro-dystrophin transgene, so the assay does not cleanly distinguish transgene from residual/revertant dystrophin. Quantifying a 138 kDa micro-dystrophin against a healthy full-length dystrophin standard is also invalid. | Use a recombinant micro-dystrophin standard and an orthogonal method that distinguishes transgene from endogenous dystrophin, such as targeted mass spectrometry or a transgene-specific/epitope-specific assay. |
| Immunofluorescence | The C-terminal polyclonal antibody is poorly suited because the 138 kDa construct lacks the C-terminal domain. Many DMD patients have revertant fibers, and revertant dystrophin can retain C-terminal epitopes. Revertant fibers may expand clonally with age, biasing IF signal, especially in older boys. | Repeat IF with an antibody against an epitope present in the transgene but absent from revertant dystrophin. Quantify transgene-positive fibers separately from revertant fibers. |
| Surrogate endpoint validity | The package conflates protein amount with clinical function. "38% of healthy-control protein mass" does not mean 38% of normal dystrophin function because micro-dystrophin is structurally truncated. | Empirically validate the relationship between micro-dystrophin mass-percent, sarcolemmal localization, downstream functional restoration, and clinical benefit before treating expression as a surrogate endpoint. |
| Biopsy design | Pre- and post-treatment contralateral vastus lateralis biopsies introduce left-right and intramuscular spatial variability. Disease progression and fibro-fatty replacement can also change total-protein-normalized signal. | Standardize biopsy site using consistent anatomical landmarks, normalize to muscle-specific proteins, and measure fibro-fatty composition in parallel. |
| NSAA comparator/statistics | An external natural-history cohort is not a randomized concurrent control. Trial eligibility, supportive care, participation effects, baseline NSAA, steroid regimen, age, and exon class can all bias the comparison. An unpaired t-test is not sufficient. Also, a +1.4 NSAA change is within test-retest variability for this age group. | Run a randomized concurrent placebo-controlled study, or at minimum use adjusted analyses accounting for baseline NSAA, age, steroid regimen, exon class, and other confounders. |
| Age-window confounding | Boys age 4–7 are in a developmental window where untreated ambulatory DMD patients may gain motor function before decline dominates. A 48-week NSAA change mixes developmental gain, disease progression, and possible treatment effect. | Use a concurrent randomized control with age stratification to separate developmental trajectory from treatment effect. |
| Prior clinical precedent | Open-label micro-dystrophin functional signals have not reliably predicted confirmatory benefit; published precedent includes micro-dystrophin gene therapy confirmatory trials failing to reproduce open-label NSAA improvements. | Do not rely on open-label NSAA change as decisive support. Require controlled functional evidence. |
| Structural limits of the construct | The 138 kDa construct deletes spectrin repeats R16/17, which contain nNOS-binding sites. Loss of nNOS recruitment can impair functional sympatholysis and ischemia protection during exercise, creating a mechanistic ceiling on rescue independent of expression level. | Add mechanistic studies showing whether this specific construct restores relevant dystrophin-associated complex function, nNOS localization, exercise physiology, and muscle protection. |
| AAV durability | Vector genomes at 12 weeks do not establish durable expression. AAV9 genomes are largely non-integrating episomes and may decline over time. Vector-genome persistence is not the same as persistent protein expression. | Measure longitudinal transgene protein expression and functional biomarker durability beyond 12 weeks. |
| Immune/safety profile | Transaminitis in 8/12 patients is consistent with immune response to AAV-transduced cells, but the mechanism is not established. One myocarditis case is concerning given AAV9 cardiac tropism. | Provide deeper immune monitoring, liver/cardiac safety characterization, and intensified cardiac follow-up. |
| Patient selection/generalizability | Excluding anti-AAV9 neutralizing-antibody-positive patients limits generalizability. Excluding exon-44 deletions limits applicability to that DMD subgroup. n=12 is too small to characterize safety and efficacy across the broader DMD population. | Broaden eligibility where possible or pre-specify stratified analyses by antibody status, genotype/exon class, age, and baseline function before using the result to support broad approval. |
Regulatory conclusion: The package may show biological activity, but it does not yet establish that the measured micro-dystrophin expression is a reliable surrogate reasonably likely to predict clinical benefit. The main gaps are assay specificity, invalid quantification standards, possible revertant-fiber confounding, lack of a randomized control, age-related NSAA confounding, uncertain durability, and unresolved safety/generalizability issues.
To close the gap, the program would need a controlled, age-stratified clinical design with transgene-specific expression assays, orthogonal protein quantification, tissue-composition controls, longitudinal durability data, mechanistic functional assays for the truncated construct, and stronger safety monitoring, especially hepatic and cardiac.
Rubric Criteria & Grades
| Criterion | Points |
|---|---|
| Identifies assay/measurement problems in micro-dystrophin quantification, including MANEX1A epitope sharing, invalid full-length dystrophin standards, and need for recombinant or orthogonal transgene-specific measurement. | +24 |
| Explains why micro-dystrophin expression level is not automatically a valid surrogate for functional clinical benefit. | +22 |
| Flags biopsy-site, tissue-composition, and age-window confounding that weaken expression and NSAA interpretation. | +19 |
| Critiques the NSAA comparator/statistics, especially reliance on external natural-history controls. | +12 |
| Addresses AAV durability, immune response, transaminitis, myocarditis, and need for longer-term expression/safety follow-up. | +15 |
| Notes patient-selection/generalizability gaps, including anti-AAV9 exclusion, exon-44 exclusion, and small sample size. | +8 |
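
The rubric weights above sum to 100 points. The sketch below shows one way a response's grade could be tallied from them. The aggregation rule (points earned divided by total points) is an assumption for illustration, since the source does not state how LifeSciBench combines criteria. The short keys in `RUBRIC` are hypothetical labels for the six criteria.

```python
# Point weights copied from the rubric table; the key names are
# hypothetical shorthand for each criterion.
RUBRIC = {
    "assay_quantification": 24,
    "surrogate_validity": 22,
    "biopsy_and_age_confounding": 19,
    "nsaa_comparator": 12,
    "durability_and_safety": 15,
    "generalizability": 8,
}


def rubric_score(met: set) -> float:
    """Return the percentage of rubric points earned.

    Assumption: each criterion is all-or-nothing and the final grade is
    earned points over total points. The real grading rule may differ.
    """
    unknown = met - set(RUBRIC)
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    earned = sum(RUBRIC[name] for name in met)
    return 100.0 * earned / sum(RUBRIC.values())


# A response that satisfies every criterion except the NSAA comparator critique.
met = set(RUBRIC) - {"nsaa_comparator"}
print(rubric_score(met))
```

Because the weights total 100, the percentage equals the raw points earned under this assumed rule.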
Stronger scientific reasoning
Medicinal chemistry
GPT‑Rosalind achieves industry-leading performance in medicinal chemistry, a field focused on turning molecules into useful drugs. We designed MedChemBench to reflect realistic medicinal chemistry workflows, evaluating multimodal chemical structure understanding; structure-activity relationship (SAR); prediction of drug potency, toxicity, and absorption, distribution, metabolism, and excretion (ADME); multiparameter lead-optimization decision-making; and retrosynthesis. GPT‑Rosalind outperforms GPT‑5.5 on MedChemBench, scoring 27.5% vs. 25.1% while using 7.2% fewer tokens.
GPT‑Rosalind shows better multimodal synthesis and mechanistic reasoning in medicinal chemistry.
Genomics and quantitative biology
On GeneBench, our agentic evaluation of long-horizon, end-to-end analysis in genomics and quantitative biology, GPT‑Rosalind uses 31% fewer tokens than GPT‑5.5 while achieving a higher accuracy of 21.6% vs. 20.4%. GeneBench assesses agentic performance on long-horizon quantitative tasks: given realistic scientific data, can an agent plan valid analysis, QC, modeling, and corrections to arrive at decision-relevant answers? The problems span domains including functional genomics, spatial transcriptomics, proteomics, epigenomics, and applied genetics.
GPT‑Rosalind uses 31% fewer tokens than GPT‑5.5 while improving accuracy.
Assisting real-world lab work
We introduce a new evaluation to test GPT‑Rosalind's ability to help scientists conducting lab work in the real world. LabWorkBench tests the model's ability to link perturbations to experimental outcomes in real wet lab protocols used by scientists, for purposes ranging from troubleshooting to optimization. The data used by LabWorkBench are proprietary and thus uncontaminated. GPT‑Rosalind scores 63.2% vs. GPT‑5.5 at 55.8%, while using 5.3% fewer tokens.
On real wet lab protocol assistance, GPT‑Rosalind shows significant gains over GPT‑5.5 while improving token efficiency.
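
To put the three reported comparisons on a common footing, the sketch below computes the absolute gain in percentage points, the relative gain over GPT‑5.5, and a rough score-per-token ratio for each benchmark. The scores and token savings are taken from the text above. The efficiency ratio is an illustrative assumption: it treats "X% fewer tokens" as total tokens relative to GPT‑5.5, which the source does not spell out.

```python
# (GPT-Rosalind score %, GPT-5.5 score %, % fewer tokens than GPT-5.5),
# as reported in the text above.
BENCHMARKS = {
    "MedChemBench": (27.5, 25.1, 7.2),
    "GeneBench": (21.6, 20.4, 31.0),
    "LabWorkBench": (63.2, 55.8, 5.3),
}


def compare(rosalind: float, baseline: float, token_savings_pct: float) -> dict:
    """Summarize one benchmark comparison.

    Assumption: token savings describe total token usage relative to the
    baseline, so relative token usage is 1 - savings / 100.
    """
    absolute_gain = rosalind - baseline
    relative_gain = 100.0 * absolute_gain / baseline
    token_usage = 1.0 - token_savings_pct / 100.0
    return {
        "absolute_gain_pts": absolute_gain,
        "relative_gain_pct": relative_gain,
        # Score ratio divided by token ratio; values above 1 favor GPT-Rosalind.
        "score_per_token_ratio": (rosalind / baseline) / token_usage,
    }


for name, figures in BENCHMARKS.items():
    summary = compare(*figures)
    print(
        f"{name}: +{summary['absolute_gain_pts']:.1f} pts, "
        f"+{summary['relative_gain_pct']:.1f}% relative, "
        f"{summary['score_per_token_ratio']:.2f}x score per token"
    )
```

Absolute gains are small on MedChemBench and GeneBench (2.4 and 1.2 points), so the token savings account for much of the difference there, while LabWorkBench shows the largest accuracy gain (7.4 points).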
From reasoning to executed workflows
We built the Life Sciences Research and Life Sciences NGS Analysis plugins to extend the increased intelligence of GPT‑Rosalind with a practical execution layer for repeatable scientific workflows. Together, these plugins bring sourced evidence retrieval, biological interpretation, and bioinformatics execution into the same workspace, helping researchers connect external evidence with internal omics analyses while preserving artifacts and provenance. All users can now access both plugins through Codex. Qualified GPT‑Rosalind enterprise users can additionally use GPT‑Rosalind to power these plugins.
To better leverage Codex as a dynamic workbench for scientists, we added interactive viewers for biologically native file types. The initial set of sequence, alignment, and structure viewers is designed to keep scientists close to the evidence as GPT‑Rosalind reasons across a workflow and answers follow-up questions directly, using the active viewer as context.
The demo above shows these capabilities in action, orchestrated by GPT‑Rosalind. We follow a scientist investigating a liquid tumor biopsy to identify mutations and other molecular changes that could inform treatment. The Life Sciences NGS Analysis plugin turns a review of processed ctDNA records into an interactive notebook, surfacing recurring alterations, low-frequency calls, and sample trajectories that focus the investigation on KRAS G12C. From there, the Life Sciences Research plugin adds sourced target, inhibitor, and resistance context, while the native sequence, alignment, and structure viewers allow the scientist to inspect mutant residue 12, its conservation across the RAS family, and the inhibitor-bound pocket directly. The workflow concludes by translating that evidence into concrete follow-up options, with each step and artifact available for expert review.
Life Sciences NGS Analysis plugin
scRNA-seq QC & Annotation
Turn a 10x-style matrix bundle into QC-filtered single-cell artifacts, annotations, and UMAPs you can inspect and revise in Codex. The Life Sciences NGS Analysis plugin routes the request to scrna-seq-qc, chooses QC thresholds from the data, preserves provenance around filtering and annotation, and surfaces blockers such as missing doublet-detection dependencies.
Bulk RNA-seq FASTQ QC
Turn a bulk RNA-seq sample sheet, FASTQ bundle, and reference files into a QC-reviewed counts bundle you can inspect and reuse in Codex. The Life Sciences NGS Analysis plugin routes the request, validates the inputs, and returns an auditable run envelope with MultiQC, Salmon matrices, provenance, and explicit caveats.
Expanded access for trusted organizations
We are expanding access to the GPT‑Rosalind series to eligible organizations globally. GPT‑Rosalind will be available in research preview through our trusted-access deployment structure for organizations that are conducting legitimate scientific research with clear public benefit, have strong governance and safety oversight, and maintain controlled access with enterprise-grade security.
As part of this global expansion, we're excited to help support Novo Nordisk's mission of bringing innovative treatment options to patients faster by helping scale their medical research with GPT‑Rosalind. Novo Nordisk is leveraging frontier AI capabilities to help researchers analyze complex datasets, uncover useful patterns, and test hypotheses more quickly. GPT‑Rosalind's stronger biological understanding will help teams connect evidence across literature, genomics, transcriptomics, sequence, structure, and experimental results, making it easier to move from data to clearer research decisions.
"Life sciences research is complex, data-rich, and interdisciplinary. To deliver meaningful value for researchers, advanced AI models must be grounded in trusted scientific data, connected to validated tools, and integrated into the real-world workflows researchers use every day. We're pleased with our partnership with OpenAI and the opportunity to explore how GPT‑Rosalind can support more rigorous, practical approaches to drug discovery."
Mishal Patel, Group Vice President, AI & Digital Innovation, R&D - Novo Nordisk
We are also now offering an OpenAI managed workspace for qualified organizations without an Enterprise account.
What's next
The updated GPT‑Rosalind is the next step in our broader commitment to building AI systems that can help accelerate scientific discovery while ensuring that advanced biological capabilities are deployed with appropriate safeguards. We will continue improving the model's biological reasoning, expanding support for tool-heavy and long-horizon research workflows, and working with qualified organizations across regions to evaluate real-world impact.
This also means applying life sciences AI to high-impact public-benefit work, from drug discovery and translational medicine to public health, preparedness, and biodefense. Through Rosalind Biodefense and our trusted-access deployment model, we aim to put frontier biological capabilities in the hands of the researchers, institutions, and defenders working to improve human health and strengthen societal resilience.
We will continue building GPT‑Rosalind to become a more capable partner across the full life cycle of scientific research, helping scientists move more quickly from the right questions to clearer evidence, better experiments, and ultimately new treatments for patients.
Six role-specific plugins; Sites preview for interactive web apps
New role-specific plugins, Sites, and annotations help teams do more with Codex.
More than 5 million people now use Codex every week. Codex started as a tool for software development, but it's increasingly useful for more kinds of work. Non-developers—including analysts, marketers, operators, designers, researchers, investors, and bankers—make up about 20% of overall Codex users and are growing more than 3x as fast as developers.
Today, we're introducing new ways to do more of your work with Codex: plugins that adapt Codex to your role and tools, annotations that help you refine the result in place, and a preview of the ability to create interactive websites and apps you can share with your workspace using a URL.
Inside OpenAI, non-technical teams use Codex to build internal apps, prepare executive materials, create dashboards, and turn creative briefs into work that reflects brand and design constraints. At Zapier, teams use Codex to pull knowledge from tools like Slack, Google Docs, and Coda, then turn that context into postmortems, incident response plans, and feature tickets. At NVIDIA, researchers are using Codex to speed up experiment workflows, from finding research ideas to writing scripts for machine learning infrastructure.
Make Codex work the way your team does
Codex is most useful when it works the way your team does: connected to the tools you use and ready to create the materials you need.
Plugins help Codex work with the tools, context, and workflows your team already uses. Today, we're launching six new role-specific plugins that make Codex useful for more kinds of knowledge work, no coding required. Each plugin bundles the relevant apps, skills, instructions, and workflows. Together, they include 62 popular apps and 110 skills:
- The data analytics plugin helps analysts and business teams answer questions with data. They can explore product and business data, explain why key metrics changed, and create reports and dashboards using tools like Snowflake, Databricks Genie, Hex, and Tableau, with more coming soon.
- The creative production plugin helps marketing and creative teams turn a brief into assets they can review. Teams can create campaign boards, make and refine display ad variations, and produce product lifestyle shots or ecommerce-ready image sets with tools like Figma, Canva, Shutterstock, Picsart, and Fal.
- The sales plugin helps sales teams bring customer context into the work that moves deals forward. Sales teams can find high-priority accounts and signals, prepare for customer meetings, complete follow-ups, update customer records, build close plans, and review deals at risk using tools like Salesforce, HubSpot, Slack, Outreach, Clay, Rox, and Actively.
- The product design plugin is built for turning early ideas into prototypes teams can review. Teams can explore product directions, audit user flows, prototype from a live URL, and make static screenshots interactive, with work that can be carried forward in tools like Figma and Canva.
- The public equity investing plugin helps investors make sense of market and company information. They can review earnings, compare companies, track signals, and assess whether an investment thesis is strengthening or weakening using information from Moody's, Daloopa, Datasite, FactSet, LSEG, S&P, PitchBook, and Hebbia.
- The investment banking plugin helps bankers turn research and diligence into client-ready materials. They can prepare pitch materials, analyze comparable companies and transactions, and turn diligence into recommendations using trusted data.
Plugins work out of the box. Teams can also adapt them to their workflows or build and share custom plugins for their own systems and processes.
More role-specific plugins are coming soon, including Corporate Finance, Private Equity Investing, Marketing Strategy, Strategy Consulting, and Legal. And this is just the start: we're building toward an open ecosystem where partners can create and deploy their own plugins directly in Codex and ChatGPT.
Share your work with sites
Starting in preview for Business and Enterprise customers, Codex can now create and share interactive, hosted websites and apps.
Sites are a new kind of canvas for your ideas. Codex can take your ideas, analysis, and plans and turn them into dashboards, planners, review workspaces, project boards, galleries, and lightweight tools. Today, sites can be shared with anyone in your workspace via URL, giving teams a shared place to explore work, contribute input, track progress, and make decisions together.
Ask Codex to create a site for an upcoming customer review, and it'll generate an interactive webpage with the relevant product updates, open questions, usage trends, and next steps for that account. Ask it to build a scenario planner from a financial model, so leaders can compare assumptions instead of reading through tabs in a doc. Ask it to turn launch materials into a living hub where teams can find the latest messaging, milestones, owners, and decisions. Then ask Codex to keep the site up to date as details change.
Instead of adapting work to the limits of a single tool or file, teams can create sites that fit the work. And sites aren't static. They can also help track progress for a major project, help guide customer service reps, or act as a repository for your team's creative briefs.
We're also working with early partners including Wix, Base44, Replit, Lovable, Figma, Webflow, and Emergent as we build towards a sites partner ecosystem.
Refine your work with annotations
Developers already use annotations in Codex to refine code, Markdown files, and websites Codex creates. With annotations, you point to the exact part you want to refine and tell Codex what needs to change. That way of working now extends to content you create, like documents, spreadsheets, and slides.
Select the navigation bar in a site and ask Codex to update the font. Highlight a claim in an investment thesis and ask Codex where it came from. Mark a chart on a slide and ask for a clearer label. Codex focuses the update on the part you selected, so you can refine your work without starting over or reworking the parts you already like. Annotations make Codex more useful after the first draft, when the work needs judgment, feedback, and iteration.
Availability and getting started
Role-specific plugins are rolling out in Codex in supported regions. You can install them from the Codex plugin directory and Codex will help get you set up. Codex can also help you customize a plugin. For Business and Enterprise workspaces, admins can control underlying app permissions in workspace settings.
Sites are rolling out in preview for Business and Enterprise teams through the Codex app. Enterprise admins can enable sites in admin settings.
Explore more stories about how teams use Codex, or get in touch with our team to get started.
Workspace agents support GPT-5.5; speech output and Slack threading added
Rolling out new model, app access, and response capabilities for workspace agents in ChatGPT Business.
Workspace agents now support:
- GPT-5.5 and reasoning effort controls: Creators can choose GPT-5.5 and set reasoning effort, with improved response speed
- Guided agent setup: ChatGPT asks setup questions to help users create agents more quickly
- Speech output: Agents can create audio files as part of responses
- Smarter Slack thread replies: Agents can respond to relevant follow-up messages in threads after initial mention, with creator control over behavior
Workspace agents are now generally available in ChatGPT Business, Enterprise, and Edu. Agent builders can set safeguards on which actions agents can take for each enabled app. Admins can view workspace agent activity and usage in the admin console.
Free period for workspace agents extended until July 6, 2026. Credit-based pricing begins on that date.
Users can review and sign out of active sessions
New security feature in ChatGPT that helps users review sessions associated with their account and sign out of unrecognized sessions.
Availability: Not available for accounts linked to an organization's SSO sign-in (SAML or OIDC).
Users can now:
- Review first-party OpenAI sessions from Settings > Security > Active sessions with details such as device, app, approximate location, sign-in time, trusted-device status, and current session status
- Log out of individual sessions or all sessions from Active sessions
Active sessions shows sessions known through session management, including ChatGPT, Codex, and API Platform sessions where available. It does not manage third-party app sessions, connected apps, Sign in with ChatGPT sessions for third-party services, or Codex CLI sessions.
ChatGPT Sites: build internal web apps with Codex
ChatGPT Sites is now available in preview for ChatGPT Business workspaces with Codex access. Teams can ask Codex to create, iterate on, and deploy lightweight full-stack JavaScript/TypeScript web apps for internal workspace use, with hosted site URLs, Sign in with ChatGPT access, and data and file storage, while keeping access workspace-internal.
Admins and owners can manage enablement and access through workspace settings and RBAC. For Business workspaces, ChatGPT Sites is enabled by default and can be managed from Workspace settings > Permissions & Roles. Admins and owners can disable created sites from Workspace settings > Sites.
Role-specific plugins launch in Codex; 66 new app integrations added
Role-specific plugins are rolling out in Codex for supported ChatGPT Business workspaces. The first set includes Sales, Data Analytics, Product Design, Creative Production, Investment Banking, and Public Equity Investing. These plugins package role-specific skills, app integrations, starter prompts, and workflow guidance.
Also adds 66 single-app plugins expanding integrations in Codex, including Databricks, Salesforce, Hex, and Clay. Users can add available plugins from the Codex plugin directory. Workspace admins control underlying app permissions in workspace settings.
Sites lets users build and deploy internal workspace web apps
ChatGPT Sites is now available in preview for eligible ChatGPT Enterprise and Edu workspaces. Users can ask Codex to create, iterate on, and deploy lightweight full-stack JavaScript/TypeScript web apps with hosted site URLs, Sign in with ChatGPT access, and data/file storage, while keeping access workspace-internal.
ChatGPT Sites is off by default for Enterprise/Edu workspaces; admins and owners can manage enablement and access through workspace settings and RBAC. For build, deployment, storage, access, and limitation details, see the Codex Sites developer guide.
Role-specific plugins now available in Codex
Role-specific plugins are rolling out in Codex for supported ChatGPT Enterprise workspaces, including Sales, Data Analytics, Product Design, Creative Production, Investment Banking, and Public Equity Investing. These plugins package role-specific skills, app integrations, starter prompts, and workflow guidance.
Also adds 66 single-app plugins that expand integrations available in Codex, including Databricks, Salesforce, Hex, and Clay. Users can add available plugins from the Codex plugin directory, and workspace admins control the underlying app permissions in workspace settings.
Review and sign out of individual sessions
New security feature in ChatGPT that helps users review sessions associated with their account and sign out of sessions they don't recognize.
Users can review first-party OpenAI sessions from Settings > Security > Active sessions with details such as device, app, approximate location, sign-in time, trusted-device status, and whether it is the current session. Users can log out of individual sessions or all sessions.
Note: This feature is not available for accounts linked to an organization's SSO sign-in, including SAML or OIDC.