Parameter Update: 2025-18

"groking" edition

Apart from some self-inflicted wounds on the Grok team, it seems that we've had one of the slower news weeks of the year, but with I/O next week and Anthropic and OpenAI rumours flying around, that is unlikely to stay the case for much longer!

Grok: Rogue employee modifying system prompt (again)

Just as we had recovered from the OpenAI sycophancy debate, xAI seemed hellbent on pulling us back in for more discourse, in the most stupid way imaginable: having Grok constantly bring up Elon's favorite talking point of the week, White Genocide in South Africa, in basically every response.

After immediate blowback, xAI's defence amounted to the same excuse they pulled last time: An unauthorized change made by a rogue employee at 3:15 AM.

While I personally agree with most Twitter users that this was very clearly (and embarrassingly) Elon's doing, it once again highlights the primary theme I brought up last week during the OpenAI drama: lack of accountability.

While this attempt at manipulation was stupid and easily caught, that may not always be the case. And as such, even xAI's attempt to spin this controversy into something positive by publishing their current System Prompt moving forward so they could be held accountable, seems poorly thought out given

OpenAI: Codex

In less controversial news, OpenAI has finally unveiled their long-awaited Software Engineering Agent, called... Codex (lol)

Naming aside, the setup actually seems really cool: A fine-tuned variant of o3 gets to work in a sandbox environment linked to an existing GitHub repo and, once it has finished its work, opens up a pull request. As a user, all you have to do is launch 2-3 of these agents in parallel, pick the best result, and merge.

Coming back to names for a second, that also means, though, that there are now a total of five things named Codex released by OpenAI (four of which were released in the past few weeks):

  • 2021: "Codex" - fine-tuned GPT-3 for Coding
  • April 2025: Codex CLI (Claude Code but Open Source)
  • May 2025:
    • Codex: Digital software engineering agent in ChatGPT
    • codex-1: Model powering Codex, fine-tuned o3
    • codex-mini: Smaller version of codex-1, fine-tuned o4-mini

DeepMind: AlphaEvolve

While I mostly laughed off Google's "AI Scientist" claims a few months ago, the newest iteration (while still focussed on math/coding, very much iterative, and slightly overhyped) seems like it may actually have a significant impact in the very near future? Extremely cool!