Parameter Update

Parameter Update: 2025-26

"truth seeking" edition

Gereon Elvers

14 Jul 2025 — 3 min read

When I speculated last week that things might pick up some steam, I clearly underestimated by how much, because we really haven't had a week like this in a while!

xAI

Grok Drama

Just hours before revealing their new Grok 4 series of models (below!), xAI got themselves into some trouble with the existing "reply guy" infrastructure (the thing where Grok responds when tagged on Twitter). For one reason or another (xAI would go on to blame a minor modification to their system prompt, which I am surprised would consistently trigger behavior like this), the AI started spouting extremist viewpoints while also referring to itself as "MechaHitler".

Grok is currently calling itself ‘MechaHitler’ pic.twitter.com/A6YAkvbfoh
— Josh Otten (@ordinarytings) July 8, 2025

Grok 4

While initially slightly overshadowed by the MechaHitler stuff, it turns out that Grok 4 is actually a really good model - by 10x-ing the post-training/RL spend, it is now beating out effectively all other models out there (at the cost of also being a lot more expensive).

Grok 4 comes in 1st or 2nd in every benchmark, even the "not good" ones. Compared to Claude 4 Sonnet, it cost almost 5x more money to run the Artificial Analysis benchmark pic.twitter.com/oUio9VeOTJ
— Theo - t3.gg (@theo) July 10, 2025

One small caveat: the model appears to be post-trained to check Elon Musk's Twitter account when asked for its opinion on controversial topics (l0l):

Grok 4 decides what it thinks about Israel/Palestine by searching for Elon's thoughts. Not a confidence booster in "maximally truth seeking" behavior. h/t @catehall. Screenshots are mine. pic.twitter.com/WFAG3FOG10
— Ramez Naam (@ramez) July 10, 2025

Kimi

Slightly overshadowed by Grok, but no less impressive: Chinese startup Moonshot AI has released "Kimi K2", an absolutely gigantic 1T parameter (MoE though, so only 32B active parameters!) open-weights model under a slightly modified MIT license (it's not MIT anymore then, so shouldn't be called that).
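To put the MoE numbers in perspective, here is a quick back-of-the-envelope sketch. It only uses the two figures quoted above (1T total, 32B active); the function name is made up for illustration, and the idea that per-token compute roughly tracks the active parameter count is a common rule of thumb, not something Moonshot has stated here.

```python
# Figures quoted in the post: 1T total parameters, 32B active per token.
TOTAL_PARAMS = 1_000_000_000_000
ACTIVE_PARAMS = 32_000_000_000


def active_fraction(active: int, total: int) -> float:
    """Share of the model's weights that take part in processing one token."""
    return active / total


fraction = active_fraction(ACTIVE_PARAMS, TOTAL_PARAMS)
print(f"{fraction:.1%} of parameters active per token")
```

So only a small slice of the weights does work for any given token, though all 1T still have to be stored somewhere to serve the model.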

🚀 Hello, Kimi K2! Open-Source Agentic Model!
🔹 1T total / 32B active MoE model
🔹 SOTA on SWE Bench Verified, Tau2 & AceBench among open models
🔹Strong in coding and agentic tasks
🐤 Multimodal & thought-mode not supported for now

With Kimi K2, advanced agentic intelligence… pic.twitter.com/PlRQNrg9JL
— Kimi.ai (@Kimi_Moonshot) July 11, 2025

Besides handily beating most other non-reasoning models and being post-trained specifically for tool calls, it also appears to simply be a superior writer whose output reads very differently from most other models' "slop".

Kimi has a distinct writing style that is free of most of the patterns we now associate with AI generated text. Both Kimi and DeepSeek's prose is apparently even more impressive in Chinese. Both of these models have a unique 'voice', quite different from Western AI. https://t.co/25NL4VUv23 pic.twitter.com/8CxqjGyMAq
— Andrew Curran (@AndrewCurran_) July 13, 2025

Windsurf Acquisition Drama

A couple of weeks after Altman and the Windsurf people first started posting about a potential acquisition (and frankly, when I thought the deal was already done), there have been some new developments. The Windsurf CEO (and a couple of engineers) are instead going to Google. As part of the deal, Google will also pay Windsurf $2.4B for a non-exclusive license to use their technology. My take: This looks like an acquisition if I've ever seen one and honestly sucks for the remaining engineers, who now appear to own unvested shares of nothing. I am also not sure I like this new acqui-hire pattern as a way of skirting around antitrust.

Welcome Windsurf to this list of totally serious independent companies pic.twitter.com/JfJ4xLM9hS
— Deva Hazarika (@devahaz) July 11, 2025

Other stuff

HuggingFace Robot

While I haven't gotten too much into robotics so far, HuggingFace's new "Reachy Mini" robot really makes me want to change that. For just $299 ($449 if you actually want to build anything serious), it seems you can get a whole lot of robot!

Thrilled to finally share what we've been working on for months at @huggingface 🤝@pollenrobotics

Our first robot: Reachy Mini

A dream come true: cute and low priced, hackable yet easy to use, powered by open-source and the infinite community.

Tiny price, small size, huge… pic.twitter.com/yl71EtwTKs
— Thomas Wolf (@Thom_Wolf) July 9, 2025

Perplexity Comet

After teasing it for months, Perplexity has finally given the first users access to its new "Comet" browser. Turns out that while it _will_ probably collect most of your data, it appears to actually be good?
