AI news you can't miss this week

OpenAI preps GPT-5 with o3 reasoning for August, GitHub's Spark turns ideas into apps, DeepMind's Gemini wins IMO gold, Unitree ships $5,900 humanoid robot, and Anthropic discovers AI's thinking paradox

Top AI stories this week:

1️⃣ OpenAI's flagship model receives major reasoning upgrade ahead of August launch

2️⃣ GitHub launches Spark: natural language to full-stack apps in minutes for Copilot Pro+ users

3️⃣ Anthropic discovers AI thinking paradox where longer reasoning leads to worse performance

  • Google DeepMind's Gemini Deep Think achieves IMO gold medal with 35/42 points, solving complex math problems in natural language.

  • Unitree's $5,900 humanoid robot challenges Tesla's pricing.

WEEKLY AI RECAP / 22 - 26 July 2025

🚀 GPT-5 arrives next month with advanced reasoning 🚀
August release will feature integrated o3 capabilities

OpenAI is preparing to release GPT-5 in early August, marking the company's most significant model upgrade since GPT-4. The new model integrates o3 reasoning capabilities directly into the core system, enabling more sophisticated multi-step problem solving. CEO Sam Altman described testing GPT-5 on complex questions he couldn't answer himself, calling the experience "strange" as the AI responded perfectly while he felt "useless compared to the AI." The launch will include mini and nano versions for API deployments, with Microsoft expanding server capacity to handle the anticipated demand surge.

💻 GitHub launches AI-powered app builder Spark 💻
Copilot Pro+ users can now build full-stack apps in minutes

GitHub has launched Spark in public preview. The tool lets developers build and deploy full-stack AI apps from nothing but natural language descriptions. Spark uses Anthropic's Claude Sonnet 4 model to generate complete applications, with frontend, backend, and hosting included. It is available exclusively to Copilot Pro+ subscribers and marks a major step in the "vibe coding" trend reshaping software development.

Microsoft CEO Satya Nadella described Spark as "a new tool in Copilot that turns your ideas into full-stack apps, entirely in natural language." The platform requires no setup, so users can go from concept to deployed application in minutes. This positions GitHub to compete directly with no-code platforms like Lovable, which recently reached $100M ARR in just eight months.

The launch comes as Microsoft doubles down on AI-powered development tools, with more than 150 million developers already using GitHub. Spark's "Dream it. See it. Ship it." tagline reflects the company's ambition to democratize app development and capture the growing market of citizen developers.

🧠 Anthropic finds AI thinking paradox 🧠
Longer reasoning seems to lead to worse performance

Anthropic researchers uncovered a counterintuitive AI phenomenon: giving models more time to "think" often makes them perform worse, not better. Their study finds that Claude, OpenAI's o-series, and other leading reasoning models frequently become distracted or overfit to how a problem is framed when given extended reasoning time, and accuracy declines on several types of tasks as a result.

This directly challenges the widespread assumption that more processing time equals better outcomes, a principle driving the development of reasoning models like OpenAI's o1 series. The findings suggest that current RLHF training methods may be creating models that overthink problems rather than solving them efficiently.

This research could fundamentally reshape how AI companies approach model training and inference optimization, potentially explaining why some "faster" models outperform their "deeper thinking" counterparts.

INTERESTING TO KNOW

🏆 DeepMind's Gemini Deep Think achieves IMO Gold Medal 🏆

Google DeepMind achieved a major breakthrough with Gemini Deep Think earning gold-medal performance at the International Mathematical Olympiad (IMO) 2025, solving 5 out of 6 problems perfectly and scoring 35 out of 42 points.

This is a significant advance over last year's silver-medal performance by the AlphaProof and AlphaGeometry 2 systems, which scored 28 points. Those systems needed experts to translate problems into domain-specific languages and took 2-3 days of computation. This year's Gemini model operated end-to-end in natural language, producing rigorous mathematical proofs directly from the official problem descriptions within the 4.5-hour competition time limit.

The model used an enhanced reasoning mode with parallel thinking techniques, which let it explore multiple solutions simultaneously before giving a final answer. IMO President Prof. Dr. Gregor Dolinar confirmed solutions were "clear, precise and easy to follow." Google DeepMind trained this version using novel reinforcement learning techniques with multi-step reasoning and curated high-quality mathematical solutions.

🤖 China's Unitree drops humanoid robot under $6K 🤖

Unitree Robotics unveiled its R1 humanoid robot for just $5,900, dramatically undercutting American competitors in the race to bring multipurpose humanoids to the masses. While rivals like Tesla's Optimus ($20K) and Figure 02 ($50K) discuss future affordability, the Chinese startup is shipping now.

The 25 kg machine features 26 joints and multimodal AI with voice and image recognition, positioning itself as one of the world's first sub-$6,000 humanoid robots. The timing is strategic: Unitree recently filed for what could be the first humanoid robotics IPO on a Chinese exchange, signaling China's aggressive push to dominate the emerging robotics market.

📩 Have questions or feedback? Just reply to this email. We’d love to hear from you!

🔗 Stay connected: