ARC-AGI-3: a new benchmark that exposes AI’s biggest weakness

PLUS: Google ships Lyria 3 Pro with 3-minute tracks & Meta's Manus founders blocked from leaving China. Google Ads rolls out Veo for video creation, Reflection AI eyes $25B valuation.

In today’s agenda:

1️⃣ The new ARC-AGI-3 benchmark humbles every frontier model — GPT-5.4, Gemini 3.1 Pro, Claude Opus 4.6, and Grok 4.2 all score below 0.4% while humans hit 100%

2️⃣ Google launches Lyria 3 Pro, its upgraded AI music model supporting full 3-minute tracks across Gemini, Vertex AI, and Google Vids

3️⃣ Manus co-founders told not to leave China while authorities review the $2.5B Meta acquisition, raising cross-border AI deal concerns

  • Google Ads integrates the Veo video generation model directly into its ad platform for on-demand video creation

  • Nvidia-backed Reflection AI seeks $2.5B in funding at a $25B valuation, positioning itself as the "DeepSeek of the West"

MAIN AI UPDATES / 26th March 2026

🤖 ARC-AGI-3: a new benchmark that exposes AI’s biggest weakness 🤖
A new reasoning benchmark resets the AI scoreboard, exposing the speed gap between humans and models.

The ARC Prize Foundation released ARC-AGI-3, an interactive reasoning benchmark designed to evaluate agentic intelligence — and the results are sobering. Gemini 3.1 Pro scored 0.37%, GPT-5.4 hit 0.26%, Claude Opus 4.6 managed 0.25%, and Grok 4.2 scored a flat 0%. Meanwhile, humans solved 100% of environments on their first attempt with no prior training. Labs had spent millions training on ARC-AGI-2, pushing scores from 3% to roughly 50% in under a year, but this new version resets the scoreboard entirely. This exposes the persistent gap between pattern-matching and genuine reasoning. A $1 million prize backs the challenge.

🎵 Google upgraded Lyria 3 Pro with 3-minute tracks 🎵
Google's rollout of Lyria 3 Pro embeds AI music generation across its entire product ecosystem.

Google upgraded its music AI model to Lyria 3 Pro, which now supports tracks up to three minutes long with full song structure — intros, verses, and choruses included. The model adds finer control over song structure and customization options. Lyria 3 Pro is rolling out across Gemini, Vertex AI, and Google Vids, signaling competitive pressure against emerging AI music startups. This positions Google as a leading player in AI-generated audio for both consumers and enterprise users, embedding generative music capabilities deep into its product stack.

🌐 Meta's Manus founders blocked from leaving China 🌐
Chinese regulation blocks Manus co-founders from traveling as authorities review Meta's $2.5B acquisition.

Manus co-founders Xiao Hong and Ji Yichao have been told not to leave China while authorities review the company's $2.5 billion acquisition by Meta. Early versions of Manus were built by engineers from a Chinese company, before a Singapore-based entity took over operations and relocated most China-based employees to Singapore — enabling the Meta purchase. Chinese authorities are concerned this structure could serve as a template for other domestic AI firms looking to sell to Western buyers and bypass oversight. This could reshape how US-China AI acquisitions are structured going forward, adding geopolitical complexity to future cross-border deals.

INTERESTING TO KNOW

📱 Google Ads rolls out Veo for video creation 📱

Google's integration of its Veo video generation model directly into the Google Ads platform lets advertisers create AI-powered video content without leaving the ad creation workflow. This lowers the barrier to video ad production for millions of businesses, reducing reliance on external creative agencies. Advertisers can now generate video assets on demand, streamlining campaign creation and signaling Google's strategy to monetize generative AI through its dominant advertising ecosystem.

💰 Reflection AI eyes $25B valuation with Nvidia backing 💰

The pricing of Reflection AI's fundraise at a $25 billion valuation signals intense investor appetite for open US-based AI systems. The Nvidia-backed startup is seeking to raise $2.5 billion, with JPMorgan reportedly among potential participants. Described as the "DeepSeek of the West," the company aims to build freely available American AI models, with national security considerations increasingly driving investment decisions in the space.

📩 Have questions or feedback? Just reply to this email , we’d love to hear from you!

🔗 Stay connected: