OpenAI ships Jalapeño, its first custom inference chip

PLUS: Gemini 3.5 Flash gets computer-use capabilities & Fable 5 resurfaces as export ban faces lawsuit. GPT-5.5 Instant rolls out conversational upgrade, NVIDIA ships NeMo AutoModel on Hugging Face.

1️⃣

OpenAI and Broadcom unveil Jalapeño, a purpose-built inference chip to cut Nvidia reliance and power 10 GW of compute by 2029

2️⃣ Google adds native computer-use capabilities to Gemini 3.5 Flash, enabling desktop automation through screenshots, clicks, and typing

3️⃣ Anthropic's Fable 5 references surface in Claude Code and Amazon Bedrock as legal challenges mount against the U.S. export ban

  • OpenAI rolls out an upgraded GPT-5.5 Instant with stronger conversational skills for both paid and free-tier users

  • NVIDIA launches NeMo AutoModel on Hugging Face, delivering up to 3.7× training throughput gains for large MoE models

MAIN AI UPDATES / 25th June 2026

🔧 OpenAI ships Jalapeño, its first custom inference chip 🔧
OpenAI cuts Nvidia reliance with a purpose-built chip reshaping compute pricing.

OpenAI, in partnership with Broadcom, has shipped Jalapeño — its first custom inference chip built to run ChatGPT, Codex, and future AI agents more efficiently than general-purpose GPUs. Developed in just nine months with AI-assisted design, the chip delivers industry-leading performance per watt. Reducing Nvidia dependence reshapes long-term AI compute pricing — OpenAI aims to power 10 GW of compute with custom silicon by 2029, though Nvidia will continue to anchor model training. The move reflects a broader trend of AI companies vertically integrating their hardware stack, signaling competitive pressure across the chip supply chain.

🖥️ Gemini 3.5 Flash gets computer-use capabilities 🖥️
Google rolls out desktop automation that directly rivals Anthropic's Claude computer-use.

Google has added native computer-use capabilities to Gemini 3.5 Flash, enabling the model to navigate desktop interfaces through continuous screenshots — performing clicks, scrolls, typing, and application interactions. This lightweight model processes visual inputs to execute real-world computer tasks across varied software environments. The rollout significantly lowers the barrier for developers building AI agents that automate complex desktop workflows, intensifying competitive pressure against Anthropic's Claude. The capability marks a major step toward AI agents that operate autonomously across arbitrary software interfaces, positioning Gemini as a serious contender in the practical agentic AI space.

⚖️ Fable 5 resurfaces as export ban faces lawsuit ⚖️
Regulation tensions mount as Fable 5 references surface and lawsuits challenge access restrictions.

Anthropic's Fable and Mythos models remain offline under a U.S. government export order, but multiple signals suggest their public return may be approaching. New references to Fable 5 have been discovered in Claude Code, and the Trump administration is reportedly in more productive talks with Anthropic. Legal-tech firm Legion has filed the first lawsuit against the order, calling it "unlawful," while bipartisan members of Congress have given the Commerce Department until June 26 to explain how access could resume. Fable 5 has also reportedly resurfaced in Amazon Bedrock. Export controls risk limiting frontier model distribution globally, underscoring the tension between AI safety policy and commercial availability.

INTERESTING TO KNOW

💬 GPT-5.5 Instant rolls out conversational upgrade 💬

OpenAI has begun the rollout of an upgraded GPT-5.5 Instant across ChatGPT, now available to both paid and free-tier users. The update delivers stronger conversational abilities, including improved intent recognition, better context retention across long conversations, and more reliable instruction following. The model also adds improved shopping and local recommendations. Expanding free-tier access strengthens ChatGPT's adoption lead, as OpenAI continues its strategy of continuous refinement across its entire model lineup.

⚡ NVIDIA ships NeMo AutoModel on Hugging Face ⚡

NVIDIA launched NeMo AutoModel on Hugging Face, a framework designed to accelerate fine-tuning integration for massive Mixture-of-Experts architectures like Qwen3 and DeepSeek V3. Benchmarks show up to a 3.7× speed increase in training throughput alongside a 32% reduction in peak GPU memory usage compared to native Transformers v5 libraries. This release offers developers critical infrastructure to efficiently fine-tune the largest open-source models on NVIDIA hardware.

📩 Have questions or feedback? Just reply to this email , we’d love to hear from you!

🔗 Stay connected: