AI news you can't miss this week

OpenAI's new GPT-Realtime-2 talks while it thinks, Claude's agents now dream to self-improve, OpenAI's o1 beats doctors in Harvard ER study, GPT-5.5 Instant ships as ChatGPT's new default & OpenAI and Anthropic bet $11.5B on enterprise AI

Best AI news of this week:

1️⃣ OpenAI releases three realtime voice models via API, including GPT-Realtime-2 with 128K context, 70+ language translation, and live transcription


2️⃣ Anthropic ships dreaming, outcomes tracking, and multiagent orchestration for Claude Managed Agents, with Harvey, Netflix, and others already deploying


3️⃣ A Harvard study finds OpenAI's o1-preview achieves a 67.1% correct diagnosis rate in ER triage, outperforming two physicians

  • OpenAI releases GPT-5.5 Instant as the new ChatGPT default, cutting hallucinations by 52.5% but raising API pricing up to 2x

  • Anthropic and OpenAI each launch PE-backed joint ventures worth a combined $11.5 billion to deploy AI across enterprise portfolios

WEEKLY AI RECAP

May 4th - May 8th, 2026

🎙️ OpenAI's new GPT-Realtime-2 talks while it thinks 🎙️
Three new API models bring GPT-5 reasoning to live voice, with rollout already underway.

OpenAI released three new realtime audio models through its API. GPT-Realtime-2 features a 128K token context window (up from 32K), GPT-5-level reasoning during live speech, multi-tool usage, and the ability to talk while thinking. GPT-Realtime-Translate supports 70+ languages for live multilingual translation, while GPT-Realtime-Whisper handles realtime speech transcription. On Big Bench Audio, Realtime-2 scored 96.6% compared to 81.4% for its predecessor, a 15.2-point jump that positions OpenAI to lead the voice-native AI agent infrastructure market. Companies including Zillow, Priceline, Vimeo, and Deutsche Telekom are already testing the models in production.

💭 Claude's agents now dream to self-improve 💭
Anthropic's Managed Agents now analyze past sessions to self-improve, and gain outcomes tracking and multiagent orchestration.

Anthropic rolled out major new capabilities for Claude Managed Agents, expanding access to production-grade autonomous agent deployments. The headline feature is "dreaming," where agents analyze past sessions between runs to identify patterns and self-improve over time — a meaningful step toward persistent agent learning. Outcomes tracking lets agents self-correct based on predefined success criteria, creating a built-in feedback loop, while multiagent orchestration enables complex task delegation to specialized subagents. Companies like Harvey, Netflix, Spiral by Every, and Wisedocs are already leveraging these features. Early enterprise adoption signals growing demand for agent infrastructure that learns autonomously.

🏥 OpenAI's o1 beats doctors in Harvard ER study 🏥
A Harvard study reveals AI integration in emergency triage could cut diagnostic delays for critical conditions.

A Harvard study published in Science tested OpenAI's o1-preview model against two physicians across 76 real emergency room cases using only raw electronic health-record text. The AI achieved a correct diagnosis rate of 67.1% at initial ER triage, compared to 55.3% and 50.0% for the two doctors. Independent reviewers could not distinguish AI-generated diagnoses from human ones. In one notable case, the model flagged a rare flesh-eating infection in a transplant patient roughly 12–24 hours before the treating doctor identified it — a capability jump that could meaningfully reduce diagnostic delays for time-critical conditions. Questions remain about real-time clinical workflow integration, but the results carry major implications for emergency medicine.

INTERESTING TO KNOW

🤖 GPT-5.5 Instant ships as ChatGPT's new default 🤖

OpenAI has released GPT-5.5 Instant as the new default model powering ChatGPT, delivering a 52.5% reduction in hallucinations across high-stakes domains like medicine, law, and finance. The model also brings stronger personalization based on user context and more concise responses. On the pricing side, the update comes with a 2x nominal price increase over GPT-5.4, though fewer completion tokens on longer prompts partially offset the jump — real cost increases range from 49% to 92% depending on use case. This sets the pricing bar for consumer AI defaults. The release reflects OpenAI's accelerating model iteration cadence, reinforcing its grip on the consumer AI market while pushing enterprise users to weigh accuracy gains against rising costs.
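
For readers who want to sanity-check the pricing math, here is a rough sketch of how a 2x nominal price can translate into a 49% to 92% real cost increase. It assumes, as a simplification, that total cost is just price per token times tokens billed and that the 2x multiplier applies uniformly to all tokens. The function name and this cost model are illustrative, not OpenAI's billing formula.

```python
# Illustrative arithmetic only. Assumption: cost = price per token x tokens billed,
# with the 2x nominal price multiplier applied uniformly to every token.
PRICE_MULTIPLIER = 2.0  # nominal price increase cited in the story


def implied_token_ratio(real_cost_increase_pct: float,
                        price_multiplier: float = PRICE_MULTIPLIER) -> float:
    """Return the new/old token-usage ratio implied by a real cost increase."""
    cost_multiplier = 1.0 + real_cost_increase_pct / 100.0
    return cost_multiplier / price_multiplier


if __name__ == "__main__":
    # The two ends of the reported real-cost range.
    for pct in (49.0, 92.0):
        ratio = implied_token_ratio(pct)
        print(f"+{pct:.0f}% real cost -> tokens at {ratio:.1%} of previous usage")
```

Under these assumptions, the low end of the range implies billed tokens falling to about 74.5% of previous usage, while the high end implies almost no savings (about 96%). Workloads that see little reduction in completion tokens should expect to land near the top of the range.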

🤖 OpenAI and Anthropic bet $11.5B on enterprise AI 🤖

Anthropic announced a $1.5 billion joint venture with Blackstone, Goldman Sachs, Hellman & Friedman, and Sequoia to deploy Claude across mid-sized companies. Hours later, OpenAI unveiled "The Deployment Company," a $10 billion joint venture with Bain Capital, Brookfield, TPG, and SoftBank to roll out its tools across their portfolio companies. Both initiatives give frontier AI labs direct commercial paths into enterprises that typically lack in-house talent to deploy AI — shifting competition from model performance to enterprise distribution channels. The parallel announcements mark a major strategic pivot: private equity firms are now the primary go-to-market engine for frontier AI, and the Anthropic-OpenAI rivalry has officially expanded well beyond the lab.

📩 Have questions or feedback? Just reply to this email, we’d love to hear from you!

🔗 Stay connected: