OpenAI's o1 beats doctors in Harvard ER study
PLUS: Anthropic's Jupiter-V1-P readies for Code with Claude showdown & Codex ships animated Pets and config imports. Google tests Omni video model ahead of I/O, 8 AI firms join Pentagon's classified networks.

In today’s agenda: 1️⃣ A Harvard study finds OpenAI's o1-preview out-diagnosed two physicians at initial ER triage 2️⃣ Anthropic is red-teaming a new frontier build called Jupiter-V1-P, likely headed for its May 6 Code with Claude developer conference 3️⃣ OpenAI ships animated Pets and automatic config imports from rival coding agents to its Codex desktop app
MAIN AI UPDATES / 4th May 2026
🏥 OpenAI's o1 beats doctors in Harvard ER study 🏥
A Harvard study reveals AI integration in emergency triage could cut diagnostic delays for critical conditions.
A Harvard study published in Science tested OpenAI's o1-preview model against two physicians across 76 real emergency room cases using only raw electronic health-record text. The AI reached the correct diagnosis in 67.1% of cases at initial ER triage, compared with 55.3% and 50.0% for the two doctors. Independent reviewers could not distinguish AI-generated diagnoses from human ones. In one notable case, the model flagged a rare flesh-eating infection in a transplant patient roughly 12–24 hours before the treating doctor identified it, the kind of early catch that could meaningfully reduce diagnostic delays for time-critical conditions. Questions remain about how the model would fit into real-time clinical workflows, but the results carry major implications for emergency medicine.
🤖 Anthropic's Jupiter-V1-P readies for Code with Claude showdown 🤖
Anthropic's next frontier model is undergoing safety probes ahead of its rollout at Code with Claude.
Anthropic has begun red-teaming a new internal build called Jupiter-V1-P, a frontier-class model being prepared for a potential announcement at its Code with Claude developer conference on May 6 in San Francisco. The red-teaming phase aligns with Anthropic's Responsible Scaling Policy, which mandates jailbreak probes and constitutional classifier stress tests before any frontier deployment. The timing suggests Jupiter-V1-P could be unveiled as Anthropic's next major model release, which would raise the competitive pressure on OpenAI and Google. The development signals Anthropic's continued commitment to safety-first deployment while it pushes capability at the frontier.
🛠️ Codex ships animated Pets and config imports 🛠️
OpenAI's Codex desktop app rolls out features designed to reduce migration friction from rival tools.
OpenAI shipped several usability updates to its Codex desktop application, including animated "Pets": screen overlays that interact via message bubbles, letting developers track agent progress without switching windows. More notably, Codex now auto-imports configuration files from competing coding agents like Cursor and Replit, a direct competitive move that lowers switching costs for developers already invested in rival tools. A new dictation dictionary also improves voice input accuracy for technical terms. These changes reflect OpenAI's strategy of differentiating through developer experience rather than raw model performance alone.
INTERESTING TO KNOW
🎬 Google tests Omni video model ahead of I/O 🎬
Google is internally testing a new "Omni" video model, which has surfaced in Gemini's video generation UI, suggesting a rollout could come as early as Google I/O. The model may unify Google's currently fragmented video and image-generation tools under a single architecture, placing it squarely against OpenAI's Sora, Runway, and other competitors. If confirmed, this consolidation would signal Google's intent to compete aggressively in multimodal content creation.
🛡️ 8 AI firms join Pentagon's classified networks 🛡️
The Pentagon has granted access to its classified networks to eight AI companies (SpaceX, OpenAI, Google, Nvidia, Reflection, Microsoft, AWS, and Oracle) for defense integration work, while notably excluding Anthropic, which retains a supply-chain risk label. DoD CTO Emil Michael acknowledged Anthropic's Mythos model as a "separate national security moment" but maintained the exclusion. The contracts reportedly carry the same autonomous-weapons and surveillance restrictions that originally led to Anthropic's blacklisting, a policy signal worth watching as frontier AI deepens its ties to defense.

📩 Have questions or feedback? Just reply to this email. We’d love to hear from you!
