Parameter Update: 2026-24
"better call sol" edition
The US government has ensured the new OpenAI launch is sufficiently interesting. Not much going on besides that.
OpenAI
GPT-5.6
Since the Claude Fable release, OpenAI's market position has really only been saved by the US government's intervention. This week, they finally responded with a new model drop. GPT-5.6 updates their naming scheme - gone are the "-mini" and "-nano" suffixes, replaced with "Sol", "Terra" and "Luna" for the three available model sizes (somehow even more confusing than the old names, but at least a little cooler?), and provides a solid step up in capabilities.
Sol is the biggest of the three models, providing performance "close to" Mythos for the same per-token price as GPT-5.5. It wins on TerminalBench, which is the headline metric OpenAI chose to highlight, but lags slightly behind in most other benchmarks - even in their new "ultra" reasoning mode, which appears to be OpenAI's response to Anthropic's "workflows" feature. The system card unfortunately provides very few details that aren't safety-related, so we'll have to take their word for it for now.
Terra roughly matches GPT-5.5 at half the price, while having worse token efficiency, leading to some very fun benchmarks where the two actually match in final price, with Terra just taking significantly longer.
Luna is positioned as a "fast and affordable model", which means I expect it to be useless for most productive tasks but great for batch processing.
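To make the "half the price, same final bill" effect concrete, here is a minimal sketch of the arithmetic. All numbers (per-token prices, token counts, throughput) are made up for illustration and are not OpenAI's actual pricing or benchmark figures; the function names are hypothetical as well.

```python
def task_cost(price_per_mtok: float, tokens_used: int) -> float:
    """Cost of one task in dollars, given a price per million tokens."""
    return price_per_mtok * tokens_used / 1_000_000


def task_seconds(tokens_used: int, tokens_per_second: float) -> float:
    """Rough wall-clock time, assuming a constant generation speed."""
    return tokens_used / tokens_per_second


# Hypothetical numbers: the cheaper model costs half as much per token
# but needs twice as many tokens to finish the same task.
baseline_cost = task_cost(price_per_mtok=10.0, tokens_used=200_000)
cheaper_cost = task_cost(price_per_mtok=5.0, tokens_used=400_000)

# Assumption: both models generate at the same speed.
baseline_time = task_seconds(200_000, tokens_per_second=100.0)
cheaper_time = task_seconds(400_000, tokens_per_second=100.0)

print(baseline_cost, cheaper_cost)  # identical final price
print(baseline_time, cheaper_time)  # the cheaper model takes twice as long
```

Under these assumptions the break-even point is simply where the token overhead equals the price discount: a model at half the price stops being cheaper once it needs twice the tokens.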
Introducing a limited preview of GPT-5.6 Sol, our next generation frontier model, as well as GPT-5.6 Terra, a balanced model for efficient, everyday work, and GPT-5.6 Luna, a fast and affordable model for high-volume work. https://t.co/OoM83SyISN
— OpenAI (@OpenAI) June 26, 2026
While Anthropic has now finally been allowed to re-release Fable to a small portion of national security organizations around the US government, OpenAI received more notice and planned out a staggered release strategy where each individual customer needs to be approved by the Trump administration. Ever the diplomat, Altman called this not "quite the process that we think is optimal".
Good new first: Sol is a smart, efficient, and a significant step forward. It is the same price as GPT-5.5. Also launching in the GPT-5.6 family is Terra, with 5.5-level performance at half the price.
— Sam Altman (@sama) June 26, 2026
Bad news: at the request of the US government, it is launching today in…
The thing to watch with these restrictions is the loss of soft power that goes along with them. As with the ML engineering restrictions Anthropic silently placed on Fable, they erode trust, which will cause some organizations to shift to non-US alternatives. I am unconvinced this is the optimal strategy for anyone involved, but I wish them nothing but success.
"the share of tokens used for US models on OpenRouter has collapsed": Bloomberg pic.twitter.com/bG8HvnU4Vl
— zerohedge (@zerohedge) June 25, 2026
Jalapeño
In line with their new model release, OpenAI also announced their first in-house chip, engineered in collaboration with Broadcom. It's specifically tuned for LLM inference (interestingly enough, they specify explicitly that it's designed for general LLMs, not OpenAI's specific deployment stack). While we got very few details, we know a few things:
- This is the first chip in a multi-year strategy, where we can expect to see chips released in conjunction with new models
- Performance-per-watt is "substantially better than current state-of-the-art", and I expect it to be the primary metric worth optimizing for
- Engineering samples are already serving GPT‑5.3‑Codex‑Spark, which is a very high-throughput model
Strategically, this is an important integration step for OpenAI. Diversification away from Nvidia has traditionally meant buying AMD or hoping Google will give you TPUs - both of which came with significant compromises. Expect a broader rollout, and hopefully more details, over the next couple of months.
team cooked, spicily https://t.co/yLb6nqrus6
— Sam Altman (@sama) June 26, 2026
Claude Tag
Anthropic has a history of announcing seemingly small things that end up reshaping the AI engineering stack. I remember dismissing both MCP and Skills when they first came out, and Claude Code was explicitly framed as an engineering experiment. Claude Tag might be the next release in this spirit.
You can think of it as a Claude Code integration into Slack, but that might be missing the reframing they want to achieve: Instead of seeing Claude as a collaborator for individual developers, it becomes a standalone team member, with its own credentials, persistent memory, and audit log.
Introducing Claude Tag, a new way for teams to work with Claude.
— Claude (@claudeai) June 23, 2026
In Slack, Claude joins as a team member with access to the channels and tools you choose. Tag Claude in and delegate tasks to it while you focus on other work. pic.twitter.com/R2C6A5Kcye
According to Anthropic, 65% of their Product team's code now comes from Claude Tag, which might very well be cherry-picked, but could also indicate Tag works really well in a shared environment with technical + non-technical people.
Personally, I am also a bit sketched out by the amount of lock-in a properly tuned instance of this might lead to - if all your organizational engineering knowledge is locked into Claude, you'll have a hard time kicking it out if Anthropic stops playing ball.
Sakana Fugu / OpenRouter Fusion
I missed covering this last week, but it's a trend worth talking about: With OpenRouter Fusion and Sakana Fugu, we've seen the sequential release of two very capable "Fusion" model systems that both claim to match Mythos/Fable in general capabilities.
OpenRouter:
Introducing the Fusion API, the smartest compound model in the market.
— OpenRouter (@OpenRouter) June 13, 2026
Fusion achieves Fable-level intelligence at half the price.
How it works 👇 pic.twitter.com/OTUQAdTQjU
Sakana:
Fugu stands shoulder-to-shoulder with leading models like Fable and Mythos across the industry's most rigorous engineering, scientific, and reasoning benchmarks.
— Sakana AI (@SakanaAILabs) June 22, 2026
Read the full blog: https://t.co/JqPwOUToGQ
Beyond Bigger Models: Why are Orchestration Models the Next Frontier… pic.twitter.com/OzG7VLjpV1
The idea behind both of these systems is relatively simple: orchestrate multiple models (sequentially and in parallel), let them vote on each other's responses, and condense the consensus into a single response. Treat the whole system as a black box that, externally, looks like a single model.
It's not surprising that this works - we've known for a while that there's a tradeoff between compute/latency and performance, and this is really just pushing that frontier, but that doesn't make it less cool.
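As a rough illustration of the orchestrate-vote-condense loop described above, here is a minimal sketch. It is not how Fusion or Fugu are actually implemented (neither publishes that level of detail here); the `fuse` function, the `Model` and `Judge` types, and the stub models are all hypothetical, and "condensing" is simplified to returning the top-voted candidate.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List

Model = Callable[[str], str]              # prompt -> response
Judge = Callable[[str, List[str]], int]   # (prompt, candidates) -> preferred index


def fuse(prompt: str, models: List[Model], judges: List[Judge]) -> str:
    """Query all models in parallel, let the judges vote, return the winner."""
    # Fan out: every model answers the same prompt concurrently.
    with ThreadPoolExecutor() as pool:
        candidates = list(pool.map(lambda model: model(prompt), models))

    # Vote: each judge picks the index of the candidate it prefers.
    votes = Counter(judge(prompt, candidates) for judge in judges)

    # Condense: most votes wins; ties go to the lowest candidate index.
    winner_index, _ = max(votes.items(), key=lambda kv: (kv[1], -kv[0]))
    return candidates[winner_index]


# Stubs standing in for real model calls (assumption: no real API is used).
def stub_model(answer: str) -> Model:
    return lambda prompt: answer


def stub_judge(preferred: str) -> Judge:
    return lambda prompt, candidates: candidates.index(preferred)


models = [stub_model("draft A"), stub_model("draft B"), stub_model("draft C")]
judges = [stub_judge("draft B"), stub_judge("draft B"), stub_judge("draft C")]

result = fuse("Write a shader", models, judges)
print(result)  # draft B
```

From the outside, `fuse` has the same shape as a single model call (prompt in, response out), which is the black-box property. The cost is visible too: every request pays for all member models plus the voting round, and latency is bounded by the slowest member.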
That being said, I am skeptical that these systems enable things in the real world that weren't doable beforehand. If none of the orchestrated models is uniquely capable of something, the ensemble also won't be. And treating an orchestrated set of models as a single entity also introduces another layer of fragility - I'd be very interested in seeing someone actually build a production application using one of these in a regular coding harness to verify the performance/latency trade-off makes sense.
I have been trying Sakana Fugu Ultra-high and, first, it is incredibly slow: my typical coding tests (shaders, interactive scenes) take 30 minutes to run
— Ethan Mollick (@emollick) June 22, 2026
And the results are... fine. It does not match Fable in real use.
Its harbor is a good example: https://t.co/xVqulPBsQf https://t.co/KJRLIlSJfX
Again, this isn't meant to take away from the fact that these systems are really cool and can probably do really cool things. It just feels like comparing a cluster of gaming GPUs with a Blackwell GPU.