Parameter Update

Parameter Update: 2025-27

"agent" edition

Gereon Elvers

21 Jul 2025 — 3 min read

Well that certainly was another week, wasn't it? Damn!

OpenAI

agent

While they still haven't released their open-source model or GPT-5 proper (though we do get more and more leaks these days!), we did get a new announcement this week. ChatGPT agent combines deep research and Operator (plus, apparently, a new foundation model) into a very interesting tool that OpenAI did a really poor job of showcasing. The model can complete long-horizon tasks (or at least what passes for one) in its own browser environment, prompting the user to step back in when input is required.

ChatGPT can now do work for you using its own computer.

Introducing ChatGPT agent—a unified agentic system combining Operator’s action-taking remote browser, deep research’s web synthesis, and ChatGPT’s conversational strengths. pic.twitter.com/7uN2Nc6nBQ
— OpenAI (@OpenAI) July 17, 2025

Bonus: The whole thing is supposed to be available for Plus users starting today, so I'm excited to try it out.

Second place at AtCoder

In the first of two competitions, we saw what was probably the last time a human beat AI at this specific task. After speed coding for three days straight, Polish super-coder Psyho managed to outcompete OpenAI's model. For now, humanity prevails!

IMO drama

In the second of the two competitions, OpenAI has trained a new "general reasoning model" that achieved gold-medal-level performance at the International Mathematical Olympiad. Given that it took Google two specialized models to achieve silver last year, that certainly sounds like a feat. But: apparently the organizers asked AI companies to hold back their announcements to give the actual human competitors some time in the limelight, which Google adhered to and OpenAI did not. OpenAI also did not collaborate properly with the organizers, so they don't even get an official medal to show off. And as Terence Tao noted, it is a bit premature to comment on the performance of unsupervised AI when the experimental setup was not declared in advance.

Terence Tao on the supposed Gold from OpenAI at IMO pic.twitter.com/xNW1yANSuM
— Rota (@pli_cachete) July 19, 2025

Mistral

Voxtral

Mistral has announced Voxtral, their new speech-to-text model. Available in three sizes, each one significantly pushes the frontier in speech-to-text. They are planning on working with partners to support use cases like diarization, which might end up making this a serious competitor for the likes of ElevenLabs (at least on some fronts).

Le Chat Updates

Besides their new model, Mistral also shipped some new quality-of-life improvements to Le Chat, including speech-to-text using Voxtral, multilingual reasoning using their "Magistral" reasoning model, Deep Research and image generation/editing using Flux Kontext. I'm always surprised by how they manage to keep up, but still don't know anyone using Le Chat regularly, so not sure how I feel about most of this.

Google

Google's new video generation model, Veo 3, is now available via their API. It's pretty pricey at $0.75 per second with audio, but this feels like the first time one of these models might be useful for anything other than cool demos and shitposting.

Windsurf Drama: Part 2

The Windsurf drama continues! After last week's revelation that the CEO and most devs would move over to Google, the rest of the company has now been acquired by Cognition, the people behind "autonomous software engineer" Devin. It appears that most people got a pretty good payout. Additionally, Windsurf announced a bunch of new features and integrations right off the bat (titled "Wave 11"). Insane speed of movement here!

Cognition has signed a definitive agreement to acquire Windsurf.

The acquisition includes Windsurf’s IP, product, trademark and brand, and strong business. Above all, it includes Windsurf’s world-class people, whom we’re privileged to welcome to our team.

We are also honoring… pic.twitter.com/N2HX0Mzz65
— Cognition (@cognition_labs) July 14, 2025

Personnel changes

It seems that Cursor can't catch a break these days. While they are still struggling with competition from Claude Code (potentially tanking their valuation) and with users mad at their price hikes, they have now also lost the two key hires they had snagged from Anthropic less than two weeks earlier. It's unclear to me what exactly went down, but this isn't a great look.

BREAKING: Claude Code PMs Boris Cherny and Cat Wu have returned to Anthropic after a brief stint at Cursor. pic.twitter.com/GGcNHfppMM
— TBPN (@tbpn) July 16, 2025