Parameter Update: 2025-26

"truth seeking" edition

When I speculated last week that things might pick up some steam this week, I clearly underestimated by how much - because we really haven't had a week like this in a while!

xAI

Grok Drama

Just hours before revealing their new Grok 4 series of models (below!), xAI got themselves into some trouble with the existing "reply guy" infrastructure (the thing where Grok responds when tagged on Twitter). For one reason or another (xAI would go on to blame a minor modification to their system prompt, which I am surprised would consistently trigger behavior like this), the AI started spouting extremist viewpoints while also referring to itself as "MechaHitler".

Grok 4

While initially slightly overshadowed by the MechaHitler stuff, it turns out that Grok 4 is actually a really good model - by 10x-ing the post-training/RL spend, it is now beating out effectively all other models out there (at the cost of also being a lot more expensive).

One small caveat: The model appears to be post-trained to check Elon Musk's Twitter account when asked about its opinion on controversial topics (l0l).

Kimi

Slightly overshadowed by Grok, but no less impressive: Chinese startup Moonshot AI has released "Kimi K2", an absolutely gigantic 1T parameter (MoE though, so only 32B active parameters!) open-weights model under a slightly modified MIT license (it's not MIT anymore then, so it shouldn't be called that).

Besides beating most other non-reasoning models handily and being post-trained specifically for tool calls, it also appears to simply be a superior writer, producing content that reads significantly different from most other models' "slop".

Windsurf Acquisition Drama

A couple of weeks after Altman and the Windsurf people first started posting about a potential acquisition (and frankly, when I thought the deal was already done), there have been some new developments. The Windsurf CEO (and a couple of engineers) are instead going to Google. As part of the deal, Google will also pay Windsurf $2.4B for a non-exclusive license to use their technology. My take: This looks like an acquisition if I've ever seen one, and it honestly sucks for the remaining engineers, who now appear to own unvested shares of nothing. I am also not sure I like this new acqui-hire pattern as a way of skirting around antitrust.

Other stuff

HuggingFace Robot

While I haven't gotten too much into robotics so far, HuggingFace's new "Reachy Mini" robot really makes me want to change that. For just $299 ($449 if you actually want to build anything serious), it seems you can get a whole lot of robot!

Perplexity Comet

After teasing it for months, Perplexity has finally given the first users access to its new "Comet" browser. Turns out that while it _will_ probably collect most of your data, it appears to actually be good?!