AI Rundown: Google's Deep Think Leads in 2026

5 min read 15.02.2026

Google's Deep Think smashes benchmarks; OpenAI debuts a Cerebras-powered coding model; MiniMax releases M2.5. Key impacts on research, devs, and gaming news.


Morning AI Rundown: Google Reclaims the Spotlight

Good morning, AI enthusiasts. OpenAI and Anthropic have dominated 2026 headlines so far, but Google just reminded everyone why it remains a powerhouse in AI. An upgraded Deep Think mode shattered benchmarks across math, coding, and science. Google also introduced an autonomous research agent that tackles open problems. Below: the key takeaways, context, and practical implications, plus related tools and a community workflow.


Google: Deep Think dominates reasoning benchmarks

The Rundown: Google updated Gemini 3's Deep Think reasoning mode and posted dominant scores in math, coding, and science. They also unveiled Aletheia, an Olympiad-level math research agent that autonomously solves and verifies open problems.

Key results

Deep Think: 84.6% on ARC-AGI-2, far ahead of Opus 4.6 (68.8%) and GPT-5.2 (52.9%).
New high of 48.4% on "Humanity's Last Exam."
Gold medals on the 2025 Physics & Chemistry Olympiads.
3,455 Elo on Codeforces, nearly 1,000 points above Opus 4.6.
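To put the gaps above in perspective, here is a minimal Python sketch that computes the ARC-AGI-2 margins in percentage points and the head-to-head expectation implied by a 1,000-point rating gap. It assumes the standard Elo expected-score formula applies to the Codeforces figure, and it uses 1,000 as a round stand-in for "nearly 1,000"; neither assumption comes from the source.

```python
# Scores quoted in the key results (ARC-AGI-2, percent).
ARC_AGI_2 = {"Deep Think": 84.6, "Opus 4.6": 68.8, "GPT-5.2": 52.9}


def margins(scores: dict[str, float], leader: str) -> dict[str, float]:
    """Percentage-point lead of `leader` over every other entry."""
    top = scores[leader]
    return {name: round(top - s, 1) for name, s in scores.items() if name != leader}


def elo_expected_score(rating_gap: float) -> float:
    """Standard Elo expected score for the higher-rated side.

    Assumption: the quoted Codeforces rating behaves like a classic Elo rating.
    """
    return 1.0 / (1.0 + 10 ** (-rating_gap / 400))


if __name__ == "__main__":
    print(margins(ARC_AGI_2, "Deep Think"))   # leads of 15.8 and 31.7 points
    print(f"{elo_expected_score(1000):.4f}")  # about 0.9968
```

Under that assumption, a 1,000-point gap corresponds to an expected score of roughly 99.7% for the higher-rated side.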

Aletheia: Google's math agent can propose proofs, verify results, and push domain benchmarks higher. The Deep Think upgrade is live for Google AI Ultra subscribers in the Gemini app, and researchers can request API access via an early access program.

Why it matters: Deep Think's results push scientific and mathematical research tools into new territory. Google's lead in benchmarks signals renewed competitiveness at the frontier of AI research.

OpenAI: GPT-5.3-Codex-Spark — speed on Cerebras chips

The Rundown: OpenAI released GPT-5.3-Codex-Spark, a speed-optimized coding model running on Cerebras hardware. It generates more than 1,000 tokens per second and is OpenAI's first product to run on non-Nvidia chips.

Details

Spark favors speed over peak intelligence. It trails full 5.3-Codex on benchmarks like SWE-Bench Pro and Terminal-Bench, but completes tasks much faster.
The launch is part of OpenAI's broader chip diversification, which includes deals with Cerebras, AMD, and Broadcom.
Rollout: research preview for ChatGPT Pro subscribers; API access limited to select enterprise partners.

Why it matters: Real-time coding with near-instant feedback could change developer workflows for tasks where some accuracy can be traded for speed. Chip diversification also reduces single-vendor risk.
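As a rough illustration of what 1,000 tokens per second means in practice, the sketch below estimates how long a response of a given length takes to stream. The 1,000 tokens/s figure is the floor quoted for Spark; the 2,000-token patch size and the 100 tokens/s comparison rate are hypothetical values chosen only for illustration, not measurements of any model.

```python
def generation_seconds(output_tokens: int, tokens_per_second: float) -> float:
    """Time to stream `output_tokens` at a steady decode rate.

    Ignores time to first token, network latency, and tool calls.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return output_tokens / tokens_per_second


SPARK_RATE = 1_000                # tokens/s, lower bound quoted for Spark
HYPOTHETICAL_BASELINE_RATE = 100  # tokens/s, illustrative only
PATCH_TOKENS = 2_000              # hypothetical size of a generated patch

if __name__ == "__main__":
    fast = generation_seconds(PATCH_TOKENS, SPARK_RATE)
    slow = generation_seconds(PATCH_TOKENS, HYPOTHETICAL_BASELINE_RATE)
    print(f"Spark floor: {fast:.1f}s, hypothetical baseline: {slow:.1f}s")
```

With these illustrative numbers the same patch streams in 2 seconds instead of 20, which is the kind of difference that makes interactive edit-and-retry loops practical.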

AI Training: Create a 20-second TV commercial with AI

The Rundown: A practical guide to generating a short TV-style ad with generative tools. The step-by-step workflow below reduces trial and error and aims for broadcast-quality output.

Step-by-step example

Plan: Ask Gemini to outline two 5-second scenes for a 20-second spot.
Frame prompts: Ask Gemini to write start and end-frame prompts for both scenes using photography terms like "Hero shot."
Image generation: In Higgsfield (basic/pro plan), use Image > Create Image > Nano Banana Pro. Set 4K quality, 4 variations, and 21:9 ratio. Generate start/end frames as instructed and download the best images.
Video: In Higgsfield, go to Video > Kling 3.0, upload frames with short scene prompts, and generate clips.
Finalize: Stitch the clips together in a free editor. Optionally generate music with Suno or ElevenLabs.

Pro tip: Use clear visual language such as "close-up," "establishing shot," or "hero shot" in prompts for predictable results.
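One way to keep the planning and frame-prompt steps organized is to write the shot plan down as data before opening any tool. The sketch below is a hypothetical helper, not part of Gemini or Higgsfield: the class names, the prompt template, and the example subject are all assumptions. Only the image settings (4K, 4 variations, 21:9) and the shot-type vocabulary come from the walkthrough.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ImageSettings:
    """Image settings quoted in the walkthrough."""
    quality: str = "4K"
    variations: int = 4
    aspect_ratio: str = "21:9"


@dataclass
class Scene:
    name: str
    shot_type: str  # e.g. "hero shot", "close-up", "establishing shot"
    subject: str    # hypothetical description of what the frame shows

    def frame_prompt(self, position: str) -> str:
        """Build a start- or end-frame prompt that leads with the shot type."""
        if position not in ("start", "end"):
            raise ValueError("position must be 'start' or 'end'")
        return f"{self.shot_type.capitalize()}, {position} frame: {self.subject}"


def build_frame_prompts(scenes: list[Scene]) -> list[tuple[str, str, str]]:
    """Return (scene name, frame position, prompt) for every frame to generate."""
    return [
        (scene.name, position, scene.frame_prompt(position))
        for scene in scenes
        for position in ("start", "end")
    ]


if __name__ == "__main__":
    plan = [
        Scene("Scene 1", "establishing shot", "a cafe storefront at sunrise"),
        Scene("Scene 2", "hero shot", "a steaming coffee cup on a marble counter"),
    ]
    print(ImageSettings())
    for name, position, prompt in build_frame_prompts(plan):
        print(f"{name} [{position}]: {prompt}")
```

Each scene yields one start-frame and one end-frame prompt, matching the pairs that get uploaded in the video step.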

MiniMax: M2.5 delivers frontier coding at low cost

The Rundown: Chinese lab MiniMax launched M2.5, an open-source model that rivals Opus 4.6 and GPT-5.2 on coding benchmarks while costing far less to run.

Details

M2.5 shows strong coding performance, roughly on par with Opus 4.6 and GPT-5.2 on key tests.
Two API variants: M2.5-Lightning (faster, $2.40 per million output tokens) and standard M2.5 ($1.20 per million output tokens). For comparison, Opus costs around $25 per million output tokens.
MiniMax reports that, internally, M2.5 handles 30% of daily tasks across departments and accounts for 80% of new code commits.
APIs are available now; open-source weights and license details are pending.
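The price gap is easiest to see with a quick calculation. The sketch below uses the per-million output prices quoted above and a hypothetical workload of 50 million output tokens per day; that workload is an assumption for illustration, and input-token charges are ignored because the source does not list them.

```python
# USD per million output tokens, as quoted above (the Opus figure is approximate).
PRICE_PER_M_OUTPUT = {
    "M2.5": 1.20,
    "M2.5-Lightning": 2.40,
    "Opus": 25.00,
}


def output_cost_usd(output_tokens: int, price_per_million: float) -> float:
    """Cost of generating `output_tokens`, ignoring input-token charges."""
    return round(output_tokens / 1_000_000 * price_per_million, 2)


def daily_costs(output_tokens_per_day: int) -> dict[str, float]:
    """Daily output cost per model for a given token volume."""
    return {
        model: output_cost_usd(output_tokens_per_day, price)
        for model, price in PRICE_PER_M_OUTPUT.items()
    }


if __name__ == "__main__":
    # Hypothetical always-on agent emitting 50 million output tokens per day.
    for model, cost in daily_costs(50_000_000).items():
        print(f"{model}: ${cost:,.2f}/day")
```

At the quoted prices, standard M2.5 output costs roughly one-twentieth of the Opus figure, and M2.5-Lightning roughly one-tenth.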

Why it matters: Lower-cost, high-performing models like M2.5 change the economics of running always-on agents. This makes agentic automation and continuous integration with AI more affordable.

Other noteworthy AI developments

ByteDance launched Seedance 2.0, a state-of-the-art video model; access remains restricted.
Mustafa Suleyman told the FT that much white-collar work could be fully automated within 12–18 months.
Elon Musk described recent xAI departures as a reorg for "speed of execution."
OpenAI is retiring GPT-4o, GPT-4.1, and o4-mini from ChatGPT.
Anthropic announced a $30B funding round at a $380B valuation and reported a $14B revenue run rate, of which Claude Code contributes $2.5B.
An OpenAI researcher resigned citing concerns about manipulation risks from large archives of human-generated content.

Trending AI tools and offers

Incogni — removes personal data from the web. Use code RUNDOWN for 55% off.
M2.5 — MiniMax's new open-source model with strong coding capabilities.

Community workflow: QR check-in system

Each newsletter highlights a reader workflow. This issue features Anthony H. from Australia, who built a lightweight QR check-in system for iPad using Google AI Studio, GitHub, and Vercel.

Features

Event session creation and member profiles
Auto-generated custom QR codes per member
Local data storage for privacy plus backup
Bulk import/export and reporting for funding requirements
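For readers curious how such a system might hang together, here is a minimal Python sketch of the data model behind the features listed above. It is not Anthony's implementation: the class names, the "checkin:<id>" payload format, and the CSV columns are all hypothetical, and rendering the payload as an actual QR image (which would need a third-party library) is left out.

```python
import csv
import io
import uuid
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass
class Member:
    name: str
    member_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    @property
    def qr_payload(self) -> str:
        """Text to encode in this member's QR code (hypothetical format)."""
        return f"checkin:{self.member_id}"


@dataclass
class Session:
    title: str
    members: dict[str, Member] = field(default_factory=dict)
    checkins: dict[str, datetime] = field(default_factory=dict)

    def add_member(self, member: Member) -> None:
        self.members[member.member_id] = member

    def scan(self, payload: str) -> bool:
        """Record a check-in; return False for unknown or repeated scans."""
        prefix, _, member_id = payload.partition(":")
        if prefix != "checkin" or member_id not in self.members:
            return False
        if member_id in self.checkins:
            return False
        self.checkins[member_id] = datetime.now(timezone.utc)
        return True

    def export_csv(self) -> str:
        """Export attendance as CSV text for reporting."""
        buf = io.StringIO()
        writer = csv.writer(buf, lineterminator="\n")
        writer.writerow(["member_id", "name", "checked_in_at"])
        for member_id, ts in self.checkins.items():
            writer.writerow([member_id, self.members[member_id].name, ts.isoformat()])
        return buf.getvalue()


if __name__ == "__main__":
    session = Session("Tuesday training")
    ada = Member("Ada")
    session.add_member(ada)
    session.scan(ada.qr_payload)
    print(session.export_csv())
```

Keeping everything in plain in-memory structures mirrors the local-storage approach described above; persistence and backup would sit on top of this.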

Want to share your workflow? Tell us here.

Gaming news note

Related gaming news: AI advances—especially faster coding models and cheaper open-source weights—accelerate tools for game development. Expect quicker prototyping, automated QA agents, and smarter NPC behaviors enabled by these models.

Closing

Google's Deep Think upgrade, OpenAI's speed-first Codex Spark, and MiniMax's M2.5 all signal rapid shifts in capability and cost. These changes will affect research labs, developer workflows, and production systems — and they'll impact adjacent industries like gaming and media more quickly than many expect.

See you soon,
Rowan, Joey, Zach, Shubham, and Jennifer — the humans behind The Rundown
