
'Inference whales' are breaching AI coding startup business models

A humpback whale breaching. Reuters/Imago Images

AI coding services face surging costs from heavy users, prompting price changes. Instead of falling, inference costs are rising, straining business models built on fixed pricing. Anthropic and Cursor are adjusting their pricing to manage costs, reflecting an industrywide challenge.

The AI coding sector has a problem.

Heavy users of AI coding services have been racking up huge costs, forcing some leading startups to overhaul their pricing structures and offerings to avoid big losses.

"Inference whales," as some in the business call these customers, are making industry insiders question whether AI products that are just "reselling inference" can survive long-term.

Inference refers to how AI models are run. Newer reasoning models break user requests down into multiple steps, which increases inference costs. When applied to AI coding services, where developers set automated agents to work on longer-term tasks, expenses can soar quickly.

This is a problem for AI coding services because they're often sold through monthly subscription plans. Many plans allow unlimited use for a fixed monthly fee, and a few users have taken advantage by bombarding the services with huge projects.

These startups must still pay for the underlying AI models, so they're getting squeezed between a relatively fixed revenue stream and rapidly rising backend costs.

"If you're purely reselling AI inference, your business could be very fragile and vulnerable, because the winds can shift violently," said Eric Simons, CEO of StackBlitz, a startup that offers a popular AI coding service called Bolt.new.

Claude Code whales

A bowhead whale breaching, from Jardine's "Naturalist's Library." Reuters/Science Photo Library

Anthropic offered its popular Claude Code service through a $200-a-month unlimited plan earlier this year. Some subscribers went berserk, using thousands of dollars' worth of AI inference over a few weeks or months.

Someone even built a website to rank these AI coding whales. The Claude Code Leaderboard lists one developer at the top who has burned through almost 11 billion tokens.

Tokens are how AI models break queries down into digestible chunks of data, and industry pricing is based on how many tokens are processed. That top-ranked developer's token usage costs almost $35,000, according to the leaderboard.

That compares with the $200 a month he's been charged. Even over a whole year, Anthropic would collect about $2,400 while incurring much higher inference costs.

Anthropic is changing its pricing

That's clearly unsustainable, so Anthropic plans to change its pricing. The $200-a-month plan will stay, but the startup will introduce weekly rate limits starting August 28. Users who blow through the new weekly limits will have to buy additional capacity.

"We've identified extreme usage by a small number of customers that impacts capacity for our broader community," an Anthropic spokesperson told Business Insider.

The startup said it has also seen "policy violations," such as account sharing and reselling access.

"We're committed to supporting advanced use cases long-term, but need to ensure consistent performance for all developers in the meantime," the Anthropic spokesperson added.

A Swedish whale

I tracked down one of the whales near the top of the Claude Code Leaderboard.

Albert Örwall, a developer based in Sweden, said he's been using the $200-a-month Claude Code subscription to build his own vibe-coding platform, along with some open-source agentic tools.
"I was probably running 3 to 4 fairly long-running tasks in parallel constantly while I was working, and that's when it really took off," he said of his Claude Code usage. Even excluding these big projects, Örwall said his regular workflow in Claude Code likely racks up inference costs of $500 per day, under a subscription that costs only $200 a month."So I'm guessing my workflow might not be sustainable for Anthropic," he added. Cursor responded, tooWhen Anthropic's new pricing kicks in, Örwall said he'll keep the $200 a month subscription for a while to get a feel for what the weekly limits actually mean for his budget."I'll avoid paying anything beyond the $200 subscription," he said, noting that he can change how he writes code and develops projects to avoid breaching the new rate limits."The reason I originally switched from Cursor to Claude Code was because usage-based pricing became too expensive in Cursor," Örwall added.Cursor is another popular AI coding service, which often uses Anthropic's AI models as the underlying intelligence powering its product. Cursor recently switched its $20 a month Pro plan from unlimited requests to a tiered system with usage-based pricing for "fast" requests, meaning users are charged extra for exceeding a certain limit.This change, coupled with a lack of clear communication, caused confusion and frustration among some users who expected unlimited usage.Cursor announced the initial change in mid-June. Then it updated with more details about 2 weeks later, and then again in early July."New models can spend more tokens per request on longer-horizon tasks," the startup wrote in a blog post, apologizing for surprising users with unexpected new bills."Though most users' costs have stayed fairly constant, the hardest requests cost an order of magnitude more than simple ones."Inference costs aren't fallingThe assumption across the industry has been that inference costs will drop dramatically, making these AI coding services more financially viable.However, in practice, this hasn't happened thus far. Instead, when a new top AI model comes out, all the AI coding services integrate it — along with its higher prices."This is the first faulty pillar of the 'costs will drop' strategy," Ethan Ding, CEO of startup TextQL, wrote in a recent blog. "Demand exists for 'the best language model,' period. And the best model always costs about the same, because that's what the edge of inference costs today."Developers and other AI users usually want the best, not last month's leading intelligence. "Nobody opens Claude and thinks, 'you know what? let me use the shitty version to save my boss some money.' We're cognitively greedy creatures," Ding wrote. "We want the best brain we can get."Even when inference costs do fall, the rise of agentic AI workflows means that developers set up longer, automated projects that generate a lot more tokens.If a project uses 100 million tokens, rather than 1 million, the initiative's cost remains high, even if per-token prices may have fallen."A $20/month subscription cannot even support a user making a single $1 deep research run a day," Ding said. "But that's exactly what we're racing toward. Every improvement in model capability is an improvement in how much compute they can meaningfully consume.""There's no way to offer unlimited usage in this new world under any subscription model," he added. "The math has fundamentally broken."Sign up for BI's Tech Memo newsletter here. 
Sign up for BI's Tech Memo newsletter here. Reach out to me via email at [email protected]. Read the original article on Business Insider.
