June 2, 2026 · 8 min read

Maybe We Were Wrong About AI Work

AI may not replace work in one dramatic wave; the sharper skill may be knowing how to route work between humans, small models, frontier models, code, caches, and escalation paths.

For the last two years, the AI story was sold as a labor story.

Developers would vanish. Analysts would vanish. Junior white-collar jobs would vanish. Customer support teams would shrink. Marketing teams would become one person and a prompt box. Every company would become leaner, faster, and strangely proud of having fewer humans around.

The scarier version came from the people building the systems. Anthropic CEO Dario Amodei told Axios that AI could "wipe out half of all entry-level white-collar jobs" and push unemployment to 10-20% in one to five years. He said the industry needed to stop "sugar-coating" what he called a "white-collar bloodbath."

Those lines travelled fast because they sounded like the future had already been decided.

But now another story is catching up with it: AI is not just powerful. AI is expensive. And the bill is starting to ask questions.

Goldman Sachs Research expects agentic AI to drive a "24-fold increase in token consumption" by 2030. The reason is simple. A chatbot answers once. An agent loops, plans, calls tools, reads files, retries, verifies, and often does the same thing several times before it gets anywhere useful. Goldman described the jump as "blowing it up 10-fold, 20-fold, 50-fold."

That sounds exciting if you sell chips, cloud, or model access. It sounds less exciting if you are the company paying the invoice.

Axios recently described the mood as AI "sticker shock." Microsoft has reportedly pulled back from some Claude Code licenses. Uber's internal AI tooling spend became a cautionary story after the company reportedly burned through its annual Claude Code budget by April. And Uber COO Andrew Macdonald gave the most useful line in this whole debate. Asked whether all that AI usage was clearly turning into better consumer products, he said: "That link is not there yet, right?"

He also said the "headline stats make your head explode."

That is the reset. Not whether AI is useful. It obviously is. The question is whether more AI usage automatically means more value.

Right now, many companies are still measuring the wrong thing. Token usage went up. AI-assisted commits went up. Prompt volume went up. Internal dashboards look impressive. But "we used more AI" is not a business result. It is closer to saying your team sent more emails.

The better question is: what business outcome actually moved because of those tokens?

Did a product ship that would not have shipped? Did a support queue shrink without making customers angrier? Did the sales team close better accounts? Did engineers solve harder problems, or did they just generate more code to review?

This is where the job-apocalypse story starts to look too simple.

OpenAI CEO Sam Altman recently admitted that he was "delighted to be wrong about this" when discussing the speed of white-collar job losses. His newer view is that there is still a "human part" of work that people care about. Goldman Sachs CEO David Solomon made a similar point about banking. The idea that AI adoption simply means fewer workers is a "very simple media narrative," he told Axios.

That does not mean nobody loses work. Some work will get crushed.

If your job is mostly copy-paste, formatting, summarizing, timestamping, routing, invoice extraction, templated replies, SEO variations, CRM cleanup, or first-pass research, AI is not a side tool. It is a direct competitor. Those tasks are going to be compressed. Some roles built around those tasks will disappear.

But that is different from saying every job disappears. Work rarely disappears cleanly. It changes shape. The spreadsheet did not kill analysts. It changed the baseline for being an analyst. Email did not kill managers. It made management faster, noisier, and harder to hide from.

AI will do the same thing, with one extra twist: intelligence now has a meter attached to it.

That meter changes everything.

The winners will not be the people who use the biggest model for every task. They will be the people who know when not to.

A frontier model is amazing when the task is ambiguous, high-stakes, creative, strategic, or full of hidden edge cases. Use it for architecture. Use it for legal-sensitive review. Use it for product judgment. Use it when the question is not "can you generate text?" but "what are we missing?"

But most work is not like that.

Most work is smaller. Extract these fields. Classify this ticket. Rewrite this note in our house style. Turn this transcript into timestamps. Convert this schema. Draft the first version. Check whether this answer follows policy. Find the duplicate records. Summarize the call. Clean the spreadsheet.

Those tasks do not need a genius every time. They need a reliable worker.

So the practical answer is not one giant model doing everything. It is a multi-model agentic system that routes work by difficulty, cost, risk, and user context.

For heavy use cases, use the strongest model. For routine use cases, use smaller models. For repeated personal workflows, fine-tune or adapt a cheaper model around that person's actual usage. For deterministic steps, use code. For repeated context, cache it. For low-risk outputs, automate. For sensitive outputs, escalate to a stronger model and a human.
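As a rough illustration, the routing rules above could be sketched as a single function. Everything here is an assumption made for the example: the `Task` fields, the 1-to-5 difficulty scale, the threshold of 4, and the order of the checks are hypothetical, not a prescribed design.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    CODE = "deterministic code"
    CACHE = "cached result"
    SMALL_MODEL = "small model"
    FRONTIER_MODEL = "frontier model"
    FRONTIER_PLUS_HUMAN = "frontier model + human review"


@dataclass
class Task:
    # All fields are hypothetical labels a team would have to define itself.
    name: str
    deterministic: bool = False  # can plain code do this exactly?
    seen_before: bool = False    # is an identical request already cached?
    difficulty: int = 1          # 1 (routine) to 5 (ambiguous, open-ended)
    sensitive: bool = False      # legal, financial, or reputational risk


def route(task: Task) -> Route:
    """Pick the cheapest path that is still safe for the task."""
    if task.sensitive:
        return Route.FRONTIER_PLUS_HUMAN  # risk outranks cost
    if task.deterministic:
        return Route.CODE                 # no model call at all
    if task.seen_before:
        return Route.CACHE                # pay once, reuse the result
    if task.difficulty >= 4:              # assumed threshold
        return Route.FRONTIER_MODEL
    return Route.SMALL_MODEL


if __name__ == "__main__":
    for task in [
        Task("convert JSON schema", deterministic=True),
        Task("summarize weekly call", difficulty=2),
        Task("system design review", difficulty=5),
        Task("contract clause check", difficulty=3, sensitive=True),
    ]:
        print(f"{task.name}: {route(task).value}")
```

The order of the checks is the design choice: sensitivity is tested first so that a cheap path never wins over a risky task, and the small model is the default rather than the exception.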

The point is not to be cheap. The point is to stop being wasteful.

A content team does not need a frontier model to create YouTube timestamps, captions, clip candidates, and metadata. A smaller model can do that. Save the expensive model for the actual judgment: What is the angle? What should the video be called? What would make someone click without feeling cheated?

A software team does not need the strongest coding model to rename variables, create boilerplate tests, or convert a JSON schema. Use a smaller model there. Save the frontier model for system design, security review, performance bottlenecks, and the "what breaks in production?" pass.

A support team does not need a premium model for every ticket. Use rules and smaller models for intent classification. Use retrieval and a mid-tier model for normal answers. Escalate angry, high-value, legally sensitive, or ambiguous cases to the strongest model plus a human.
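One way to picture the support flow is as a cascade: try rules first, fall back to a small model, and escalate when the signal is weak or the stakes are high. The sketch below is illustrative only. The keyword patterns, the `small_model` callable (a stand-in for whatever cheap classifier a team runs), and the 0.7 confidence cutoff are all hypothetical.

```python
import re
from typing import Callable, Optional, Tuple

# Hypothetical keyword rules; a real team would maintain its own.
INTENT_RULES = {
    "refund": re.compile(r"\b(refund|money back|chargeback)\b", re.I),
    "login": re.compile(r"\b(password|log ?in|locked out)\b", re.I),
}
ESCALATION_WORDS = re.compile(
    r"\b(lawyer|legal|lawsuit|furious|unacceptable)\b", re.I
)

# A small-model classifier returns (intent, confidence between 0 and 1).
Classifier = Callable[[str], Tuple[str, float]]


def classify_by_rules(text: str) -> Optional[str]:
    """Return the first intent whose pattern matches, else None."""
    for intent, pattern in INTENT_RULES.items():
        if pattern.search(text):
            return intent
    return None


def triage(
    text: str,
    high_value: bool,
    small_model: Classifier,
    min_confidence: float = 0.7,  # assumed cutoff
) -> str:
    """Return 'escalate' or 'answer:<intent>' for a support ticket."""
    # Angry, legally sensitive, or high-value tickets skip the cheap path.
    if high_value or ESCALATION_WORDS.search(text):
        return "escalate"
    intent = classify_by_rules(text)
    if intent is None:
        # Rules found nothing, so ask the small model.
        intent, confidence = small_model(text)
        if confidence < min_confidence:
            return "escalate"  # ambiguous: strongest model plus a human
    return f"answer:{intent}"  # mid-tier model with retrieval handles it
```

Here `"escalate"` means the strongest model plus a human, and `"answer:<intent>"` means the ticket goes to retrieval and a mid-tier model. Most tickets should never reach the expensive branch.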

A founder does not need to ask a giant model every five minutes to "build the whole SaaS." The better workflow is staged. Use AI to explore interface options, compare tradeoffs, mock flows, find brittle assumptions, and pressure-test decisions. Then let a competent human decide what the product should be.

This is also where personalization matters. A generic model does not know how you work unless you keep paying to remind it. That is a waste.

If a founder repeatedly writes investor updates in the same voice, tune a smaller model for that. If a recruiter repeatedly screens candidates against the same role patterns, tune for that. If a creator repeatedly turns long videos into shorts, tune for that. If a developer repeatedly writes migrations, test fixtures, or docs in the same repo style, tune for that.

The expensive model should not be your default intern. It should be your escalation path.

This is the next AI skill: not prompting, but routing.

Can you split a workflow into cheap steps and expensive steps? Can you define when a task needs deep reasoning and when it needs formatting? Can you measure quality per token, not just output per minute? Can you tell when AI is making work faster and when it is only making the dashboard look busy?
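"Quality per token" can be made concrete with a small calculation. The sketch below is a minimal example under stated assumptions: "accepted" means an output passed human review or an automated check, and the token counts, acceptance counts, and prices in the usage section are invented for illustration, not measurements or real vendor pricing.

```python
from dataclasses import dataclass


@dataclass
class RunStats:
    label: str
    tokens_used: int
    outputs_accepted: int           # outputs that passed review or an eval
    usd_per_million_tokens: float   # assumed blended price, not a real quote


def accepted_per_million_tokens(stats: RunStats) -> float:
    """Accepted outputs produced per one million tokens spent."""
    return stats.outputs_accepted * 1_000_000 / stats.tokens_used


def cost_per_accepted_output(stats: RunStats) -> float:
    """Dollars spent for each output that was actually usable."""
    if stats.outputs_accepted == 0:
        return float("inf")  # spent money, shipped nothing
    total_cost = stats.tokens_used / 1_000_000 * stats.usd_per_million_tokens
    return total_cost / stats.outputs_accepted


if __name__ == "__main__":
    # Hypothetical numbers for the same batch of 100 routine tasks.
    runs = [
        RunStats("frontier", tokens_used=2_000_000, outputs_accepted=90,
                 usd_per_million_tokens=10.0),
        RunStats("small", tokens_used=1_000_000, outputs_accepted=80,
                 usd_per_million_tokens=1.0),
    ]
    for run in runs:
        print(
            f"{run.label}: "
            f"{accepted_per_million_tokens(run):.1f} accepted per 1M tokens, "
            f"${cost_per_accepted_output(run):.4f} per accepted output"
        )
```

With these made-up inputs the frontier model has the higher acceptance rate, yet the small model is far cheaper per usable result. Whether that trade is worth it depends on what a rejected output costs, which is exactly the judgment a dashboard of raw token counts hides.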

That is the real moat.

Not AI usage. AI judgment.

Companies will slowly stop asking, "How many tokens did we spend?" and start asking, "What did those tokens produce?" The best teams will not have the biggest model bill. They will have the clearest routing, the smallest useful context, the sharpest evaluations, and the discipline to send only the right work to the expensive model.

Maybe AI does not take your job in one dramatic wave.

Maybe it takes the lazy parts first.

And maybe the person who beats you is not the one using AI the most, but the one using it with the least waste.

In India, we already have a word for this kind of practical intelligence: jugaad. Not as a shortcut. Not as cheapness. As clever constraint-aware building.

So if there is a name for the next phase, maybe it is this: Jugaad LLM programming.

Use the giant model when the work deserves it. Use the smaller model when it is enough. Fine-tune around the user. Cache what repeats. Escalate what matters.

Respect the bill.

Source notes

Axios: Behind the Curtain: A white-collar bloodbath - Dario Amodei's warning about entry-level white-collar jobs and unemployment.
Reuters via The Star: OpenAI's Altman says AI unlikely to lead to 'jobs apocalypse' - Sam Altman's updated comments on job losses and the human part of work.
Goldman Sachs: AI Agents Forecast to Boost Tech Cash Flow as Usage Soars - token consumption forecast and agentic AI cost dynamics.
Tom's Hardware: Uber chief warns no link yet between AI tokenmaxxing and shipping successful products - Andrew Macdonald's comments on AI usage, tokens, and consumer product impact.
Axios: AI isn't taking banking jobs, Goldman Sachs CEO says - David Solomon's comments on AI productivity and headcount assumptions.
Axios: AI sticker shock hits corporate America - enterprise AI cost pressure, tokenmaxxing, and ROI concerns.