Parameter Update

Parameter Update: 2025-18

"groking" edition

Gereon Elvers

19 May 2025 — 2 min read

Apart from some self-inflicted wounds on the Grok team, it seems that we've had one of the slower news weeks of the year. With I/O next week and Anthropic and OpenAI rumours flying around, though, that is unlikely to be the case for much longer!

Grok: Rogue employee modifying system prompt (again)

Just as we had recovered from the OpenAI sycophancy debate, xAI seemed hellbent on pulling us back in for more discourse, in the most stupid way imaginable: having Grok constantly bring up Elon's favorite talking point of the week, White Genocide in South Africa. In basically every response.

Oh my god Elon programmed Grok to talk about white genocide pic.twitter.com/yEcfYY5pel
— evan loves worf (@esjesjesj) May 14, 2025

After immediate blowback, xAI's defence amounted to the same excuse they pulled last time: An unauthorized change made by a rogue employee at 3:15 AM.

We want to update you on an incident that happened with our Grok response bot on X yesterday.

What happened:
On May 14 at approximately 3:15 AM PST, an unauthorized modification was made to the Grok response bot's prompt on X. This change, which directed Grok to provide a…
— xAI (@xai) May 16, 2025

While I personally agree with most Twitter users that this was very clearly (and embarrassingly) Elon's doing, it once again highlights the primary theme I brought up last week during the OpenAI drama: Lack of accountability.

This attempt at manipulation was stupid and easily caught, but that may not always be the case. As such, even xAI's attempt to spin this controversy into something positive, by publishing their system prompt going forward so they can be held accountable, seems poorly thought out, given that

the current release leaves out many placeholders that could easily be abused,
there is no way to verify that the published prompt is actually the one being used, and
they have a history of abandoning transparency when people stop paying attention.

OpenAI: Codex

In less controversial news, OpenAI has finally unveiled their long-awaited Software Engineering Agent, called... Codex (lol)

Naming aside, the setup actually seems really cool: A fine-tuned variant of o3 gets to work in a sandbox environment linked to an existing GitHub repo and, once it has finished its work, opens up a pull request. As a user, all you have to do is launch 2-3 of these agents in parallel, pick the best result and merge.
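To make the "launch a few agents, pick the best" workflow concrete, here is a minimal Python sketch. It does not call any real Codex API: `run_agent`, `Candidate` and the `tests_passed` score are hypothetical stand-ins, and the ranking criterion is an assumption (in practice you would probably just read the pull requests yourself).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Candidate:
    agent_id: int
    diff: str
    tests_passed: int  # hypothetical quality score used for ranking


def run_agent(agent_id: int, task: str) -> Candidate:
    # Hypothetical stand-in for one sandboxed agent run. A real agent would
    # clone the repo, edit code and open a pull request; here we fabricate a
    # deterministic result so the sketch is runnable.
    diff = f"patch for {task!r} from agent {agent_id}"
    return Candidate(agent_id, diff, tests_passed=agent_id % 3)


def best_of_n(task: str, n: int = 3) -> Candidate:
    # Launch n agents in parallel and keep the highest-scoring candidate.
    with ThreadPoolExecutor(max_workers=n) as pool:
        candidates = list(pool.map(lambda i: run_agent(i, task), range(n)))
    return max(candidates, key=lambda c: c.tests_passed)


if __name__ == "__main__":
    winner = best_of_n("fix flaky login test")
    print(winner.agent_id, winner.diff)
```

The only real design choice here is that the agents never coordinate: each works on the same task independently, and selection happens once at the end.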

Coming back to names for a second, though: that means there are now a total of five things named Codex released by OpenAI (four of them in the past few weeks):

2021: "Codex" - fine-tuned GPT-3 for Coding
April 2025: Codex CLI (Claude Code but Open Source)
May 2025:
- Codex: Digital software engineering agent in ChatGPT
- codex-1: Model powering Codex, fine-tuned o3
- codex-mini: Smaller version of codex-1, fine-tuned o4-mini

DeepMind: AlphaEvolve

While I mostly laughed off Google's "AI Scientist" claims a few months ago, the newest iteration (while still focussed on math/coding, very much iterative, and slightly overhyped) seems like it may actually have a significant impact in the very near future? Extremely cool!