Not every AI prompt needs Opus 4.8.

As the fervor of tokenmaxxing dies down, some AI users are wondering how to get more bang for their buck and keep their monthly costs in check. Coinbase CEO Brian Armstrong shared the crypto company’s strategy: leaning on cheaper models wherever they can do the job.

“We’re working hard on routing prompts to cheaper models where appropriate, and in some cases have been able to keep costs roughly flat, while token usage continues to grow exponentially,” Armstrong wrote on X on Sunday.

While the latest models like Opus 4.8 or GPT-5.5 promise bleeding-edge benefits, they can also devour more tokens. (That’s before you turn on Fast mode.) When Anthropic launched Opus 4.7, many users complained that they were quickly hitting rate limits.

Armstrong wrote that he anticipated “80% of workloads will be running on 99% cheaper models within 12-18 months.”

The only times when users will use the latest models, Armstrong predicted, are when they need to be “IQ maxing.” This includes scientific breakthroughs or agent orchestration.

“This leads me to think the limiting factor will be energy and compute, not better models,” Armstrong wrote.

The Coinbase CEO’s post caught the attention of some tech luminaries. Venture capitalist Marc Andreessen called it “interesting.” Hugging Face cofounder Julien Chaumond wrote that “model routing is growing a lot these days.”

Box CEO Aaron Levie wrote that Armstrong’s numbers were a “bit extreme,” but that AI use would likely stratify in the coming years. “High end” work will be completed by leading models, Levie wrote, while “high volume” work will be relegated to the cheap models.

“Intelligence allocation is going to be extremely important,” Harvey cofounder Winston Weinberg wrote.

The efficiency mindset is relatively new — or at least new to publicly flaunt. Not long ago, when tokenmaxxing was all the rage, tech leaders would post their high token bills or flex their usage of the latest models.

The tokenmaxxing mindset is especially popular in the startup space, where Y Combinator CEO Garry Tan advises founders to “let it rip” with tokens. Lance Yan, a YC-backed startup founder, told Business Insider in April that rationing tokens was “stupid.”

The tide seems to be turning. Glean cofounder Tony Gentilcore commented that Armstrong’s post was “spot on.”

“Everyone technical already knows this,” Gentilcore wrote. “The financial markets are the only ones extrapolating out Opus prices to infinite scale.”


