Parameter Update: 2025-46

"neuRIPs" edition

It really feels like we're all still recovering from the last week, and with NeurIPS right around the corner (hi from San Diego btw - get in touch if you're in town!), most people probably have enough on their hands this week as well, so I'm honestly not mad that this one was a little slow.

Anthropic: Claude Opus 4.5

While Gemini 3 is still dominating discourse, this week brought along Anthropic's newest "big boy" model - Opus 4.5. The big news, apart from the convenient (if very, very minor) edge it has over Gemini 3 Pro in coding, is the 3x price drop - it seems like the TPU deal Anthropic signed a few weeks ago may already be paying off in alleviating their inference constraints a bit (remember that Google owns ~8.8% of Anthropic!). The other story here is that the model actually has identical rate limits to Sonnet 4.5 in Claude Code but is only available to Max (min. $100/month) subscribers - weird!

In my initial tests, it felt very good for coding, but actually failed a few easier non-coding tasks, so I am wondering exactly how overfit it may be. I also got slightly less of a "big model" feel from it than from Opus 4.1, but I might just be making that up.

DeepSeek V3.2

Just this morning, DeepSeek also gave us their "Gemini-3-Pro-level" model, V3.2. While the release is undoubtedly very impressive, I can't decide what's funnier:

  • The fact that they still absolutely refuse to retire the V3 naming (the original V3 was a non-reasoning, GPT-4-level model released in December last year, before the original R1 release!)
  • The release of "V3.2-speciale", a high-reasoning variant that is the first actually-available model to achieve IMO gold (at least for the next two weeks before they'll retire it?), while being named suspiciously like a pizza

Flux 2

This one seems to have really flown under the radar of a lot of people, as I saw very little discussion around it: BFL has finally given us a proper successor to their Flux image gen model.

Now, I assume the reason people aren't as excited about this as they were about the original Flux is that results don't seem to quite match Nano Banana Pro or other proprietary models (which is a fair criticism), but I wouldn't discount the fact that this release also brought us a new -dev variant, which may lead to some very cool things down the line!

US: Genesis Mission

In an extremely hype-y press release, the US government announced the "Genesis Mission" - an effort led by the Department of Energy to build an "American Science and Security Platform" composed of multiple governmental organizations and private industry. The verbiage matches the language Altman uses around Project Stargate, framing the endeavor as a "Manhattan Project for AI".

Depending on who you ask, this is either a necessary project to ensure US sovereignty in AI or a pre-emptive bailout for the AI infrastructure boom once private investment starts to slow. So far, there doesn't seem to be a concrete dollar amount attached to it either way, so we'll see!

Launching the Genesis Mission
"By the authority vested in me as President by the Constitution and the laws of the United States of America, it is hereby ordered: Section 1. Purpose. [...]"

China: GPU sovereignty

While Gamers Nexus recently covered the impact of American export controls on the Chinese GPU market in just about as much depth as anyone could ask for, it seems like China is still doing everything it can to work towards independence. Two related announcements that came across my timeline this week:

  1. Huawei Flex:ai - even in a best-case scenario, GPUs in large clusters are rarely fully utilized due to scheduling, data bandwidth and similar constraints. Flex:ai claims to address this somehow, improving utilization by up to 30% - read: up to a 30% capacity boost overnight, even for existing infra (rough arithmetic below this list).
  2. Zhonghao Xinying "Ghana" GPTPU - I am slightly more sceptical of this claim, but apparently Chinese start-up "Zhonghao Xinying" has developed a general-purpose accelerator (read: GPU) achieving up to 1.5x the speed of an Nvidia A100. While the A100 is a couple of generations behind at this point, they claim to have done so using "only self-controlled intellectual property for the core design, with no reliance on Western companies, software stacks, or components for development, design, or fabrication". If that's actually true (big if) and they manage to scale manufacturing (similarly big if), this has got to be a big blow to Nvidia (who are already feeling the pressure from Google's TPUs on their home turf) and the US government.
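To put the "capacity boost overnight" framing in concrete terms, here's a tiny back-of-the-envelope sketch in Python. Every number in it (cluster size, per-GPU TFLOPS, 50% baseline utilization) is a made-up assumption on my part; only the 30% relative improvement comes from the Flex:ai claim:

    # Back-of-the-envelope: a relative utilization gain is an equal relative
    # gain in effective compute from hardware you already own.
    def effective_throughput(num_gpus: int, per_gpu_tflops: float, utilization: float) -> float:
        """Useful compute the cluster actually delivers, on average."""
        return num_gpus * per_gpu_tflops * utilization

    # Illustrative numbers only (cluster size, per-GPU TFLOPS and baseline
    # utilization are invented); the 1.30 factor is the claimed "up to 30%".
    baseline = effective_throughput(num_gpus=1024, per_gpu_tflops=300, utilization=0.50)
    improved = effective_throughput(num_gpus=1024, per_gpu_tflops=300, utilization=0.50 * 1.30)

    print(f"baseline: {baseline:,.0f} TFLOPS")
    print(f"improved: {improved:,.0f} TFLOPS")
    print(f"gain:     {improved / baseline - 1:.0%}")  # -> 30%, with zero new GPUs

The takeaway is just that a scheduling-layer utilization gain converts one-to-one into effective throughput on existing hardware, which is why a software update can look like buying a third more GPUs.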

OpenAI: Training Data Lawsuits

Thanks, Marc, for flagging this one!
In one of the approximately one bazillion lawsuits OpenAI is facing right now (this one brought by The Authors Guild), they are now forced to hand over internal messages and documentation related to the internal "books1" and "books2" datasets used to train GPT-3. The datasets contained around 67 billion tokens of data and have since been deleted by OpenAI (which does feel a bit like an admission of guilt, given that none of the other training data has been deleted), making it hard to evaluate what exactly went into them. The main question people seem to be debating now: did OpenAI know at the time that they were committing criminal copyright infringement? If so, we could be looking at pretty sizable fines.