Parameter Update: 2025-19

"police open up" edition

What a week! I already assumed we'd be in for a few goodies after the lackluster last two weeks, but I'm still surprised by how cool Claude 4 is, so here we are.

Sidenote: meinGPT 2.0 🚀

In case you missed it, meinGPT just officially shipped version 2.0 - we've come a long way, and this time around we've launched Assistants with Code Execution, Web Search and MCP Support, automatic PII redaction, new gpt-image-1 image gen and so much more.

If you are interested in trying out the platform, just reach out to me directly - happy to hook you up!

Google I/O

Google I/O was this week! And with it came so many announcements that it was honestly hard to keep track:

  • AI in Search is now rolled out by default for everyone - and powered by Gemini 2.5 Flash. No idea how they are making that work with their business model, but damn.
  • Veo 3 is Google's new video generation model, and it's actually good now? Just very expensive ($250 for <100 generations?). Audio is crazy weird though!
  • Imagen 4 is Google's new image generation. While it still seems less prompt-adherent than gpt-image (something Google themselves admit openly), it does feel like Google is now encroaching from both sides with the Gemini 2.0 image gen.
  • Gemma 3n is a ~5B parameter model that almost matches Claude 3.7 on LMArena.
  • Gemini Diffusion is, by quite some distance, the best text diffusion model we've seen. It's crazy fast and comes with some fun architectural quirks enabled by diffusion (don't forget: diffusion is spectral autoregression). It does feel like most of the assistant-performance has been distilled from one of the other Geminis though.
  • Jules is a Codex competitor that is free to try(!)
  • Stitch is a generative model for UI designs
  • Some smaller stuff: Gemini in Chrome can now take actions for you, Gemini Live integrates with Google Apps, 2.5 Deep Think is technically the best model out there by some metrics, Live Translation in Google Meet, clothes try-on in Google Shopping (with a custom model!?), ...

Claude 4

As mentioned in the intro, Claude 4 is this year's new generation of models by Anthropic. Keeping the exact same API pricing as Claude 3.5 (it seems these things might actually be just a different posttraining from the same base models?), we actually do get the Opus variant this time - and boy are they a treat for Software Engineers! (though they might call the cops on you for misbehaving?)

Mistral

Mistral also seems to have wanted some of the attention this week, announcing both a new document/OCR service and a small new dev model (Devstral). Big new base model when?

Microsoft: Open Source GitHub Copilot

GitHub Copilot is now released openly under the MIT license - it seems that they would rather sell you their new Agent mode?

OpenAI

buying io

In the most self-indulgent video I've seen in a hot minute, OpenAI has announced that they acquired Jony Ive's new company, io, for an eye-watering $6.4B. While I remain skeptical that their upcoming device won't end up being another Humane Pin, I am still happy that Jony gets the opportunity to make bank like that.

o3 in Operator

Operator has been updated to use o3 instead of GPT-4o as its base model, and it seems to have gotten much more useful as a result. Bummer: it's still locked to the Pro tier, though.