Google Just Reminded Everyone It's Still in the Fight

Not every week gives us something worth stopping for. This one did.

Google upgraded Gemini 3 Deep Think and posted scores I had to go back and verify. OpenAI shipped a coding model built entirely around speed. And the creator of OpenClaw, speaking through Y Combinator, made a serious argument about the future of software that deserves more attention than it got. Here is the short version of all three.

What Google Just Pulled Off

I was not expecting this kind of margin.

Gemini 3 Deep Think: ARC-AGI 2 Benchmark Scores

Model                  ARC-AGI 2 Score
Gemini 3 Deep Think    84.6%
Opus 4.6               68.8%
GPT 5.2                52.9%

Source: ARC-AGI 2 Benchmark · February 2026

Also from this update:

Gold-medal performance at the 2025 Physics and Chemistry Olympiads
3,455 Elo on Codeforces
Aletheia — a new math research agent that works on open problems and checks proofs

ARC-AGI 2 was built to resist pattern memorization, so a score of 84.6% points to reasoning through novel problems rather than recalling answers. A 31.7-point gap over GPT 5.2, and 15.8 points over Opus 4.6, is not a rounding error. That is a real capability difference, and Google earned it.
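The size of the lead is simple subtraction on the three ARC-AGI 2 scores listed above. Here is a minimal sketch of that arithmetic; the function and variable names are illustrative only and do not come from any benchmark tooling.

```python
# ARC-AGI 2 scores as reported in this issue (percent).
scores = {
    "Gemini 3 Deep Think": 84.6,
    "Opus 4.6": 68.8,
    "GPT 5.2": 52.9,
}

def point_gaps(scores, leader):
    """Return the leader's lead over every other model, in percentage points."""
    top = scores[leader]
    return {
        model: round(top - score, 1)  # round away floating-point noise
        for model, score in scores.items()
        if model != leader
    }

gaps = point_gaps(scores, "Gemini 3 Deep Think")
for model, gap in gaps.items():
    print(f"Gemini 3 Deep Think leads {model} by {gap} points")
```

Note that these are percentage-point differences on a single benchmark, not a general measure of how much "better" one model is than another.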

Aletheia is the part of this update that got the least attention and probably deserves the most. A dedicated agent that can work on unsolved math problems and verify proofs is not a consumer feature. It is a research tool. Google is clearly building for a different tier of user.

OpenAI's Bet This Week Was Speed, Not Depth

Google: Going Deep
Reasoning, research tools, advanced math, long-horizon problem solving

OpenAI: Going Fast
Speed, live coding, hardware independence, wider accessibility

GPT 5.3 Codex Spark — Fast Facts

Runs on Cerebras chips. Over 1,000 tokens per second. Backed by a $10B+ deal. AMD and Broadcom now in the hardware mix too. Built for quick edits, live coding, and instant feedback.

This is not OpenAI falling behind. It is a deliberate product decision for a real use case. Developers doing live coding do not need the deepest model — they need one that keeps up with how they think. Codex Spark does that job well.
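To put the throughput figure in perspective, here is a rough back-of-the-envelope sketch. The 1,000 tokens per second rate comes from the fast facts above and is treated as a lower bound. The 300-token edit size and the 100 tokens per second comparison rate are assumptions chosen purely for illustration, not measured figures for any model.

```python
def generation_seconds(num_tokens, tokens_per_second):
    """Time to stream num_tokens at a steady rate, ignoring network and queueing delay."""
    return num_tokens / tokens_per_second

EDIT_TOKENS = 300      # assumption: size of a small code edit
SPARK_RATE = 1_000     # from this issue: "over 1,000 tokens per second"
BASELINE_RATE = 100    # assumption: hypothetical slower model for comparison

fast = generation_seconds(EDIT_TOKENS, SPARK_RATE)
slow = generation_seconds(EDIT_TOKENS, BASELINE_RATE)

print(f"At {SPARK_RATE} tokens/s: {fast:.1f} s")
print(f"At {BASELINE_RATE} tokens/s: {slow:.1f} s")
```

Under these assumptions, the same edit goes from a multi-second wait to a fraction of a second, which is the difference between pausing and staying in flow during live coding.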

The more important story is the hardware shift. OpenAI is actively building outside of Nvidia through AMD, Broadcom, and Cerebras. That is a long-term cost and supply chain play that will show up in pricing over time.

Two companies, two very different directions. Google is going deep. OpenAI is going fast. Both are rational. Which one builders choose to build on is the question that will matter most over the next year.

The App Prediction Worth Sitting With

The OpenClaw creator argued through Y Combinator this week that roughly 80% of traditional apps will eventually disappear. The reasoning is simple and hard to dismiss.

Apps were designed around what software could do a decade ago — forms, menus, dashboards. If the interface can now understand a plain request and complete the task directly, the app layer becomes a detour. Not for everything. But for a lot.

This is not a next-quarter prediction. It is a slow shift that tends to arrive quietly and then all at once. If you build products or work with teams that do, it is worth thinking about now.

A Line Worth Carrying Into the Weekend

"The expert in anything was once a beginner who refused to quit."

Wherever you are in the learning curve right now — keep going. Everyone you admire started exactly where you are.

This week made one thing clear: the gap between the top players is real, and it is widening in some places. The companies that come out ahead will be the ones that made the right bets on depth, speed, and infrastructure before it was obvious which one mattered most.

Watch what gets built on top of these platforms over the next few months. That is where the real answer will show up.

Hit reply and let me know what you think. I read every response, and it genuinely shapes what I cover next.

This Week's Question

Which storyline from this week do you find most interesting?

A — Google's reasoning gains with Gemini 3
B — OpenAI's speed play with Codex Spark
C — The future of apps argument

Just reply with A, B, or C. Takes 5 seconds and helps me cover what matters to you.

Talk soon,

The Daily Upgrade
