What Uber's $1,500 AI cap teaches us about goal-setting
Key Takeaway: The mistake leaders keep making is setting goals on outputs (activities, things you do) instead of outcomes (results, things that happen as a consequence). Uber put 5,000 engineers on uncapped AI tooling, ranked teams by tokens burned, watched usage shoot from 32% to 84%, and then their COO had to admit he couldn't draw a line between any of that spend and a single consumer improvement. Token usage is an output. The product getting better is an outcome. When you make the output the goal, teams optimize the output. The number goes up. The thing you actually wanted barely moves. This isn't an AI problem. It's a goal-setting problem, and it happens every time leaders measure how busy the team is instead of what the team delivered.
Uber's AI mandate is going to end up in goal-setting textbooks, and not in a good way. The story goes like this. Leadership put around 5,000 engineers on enterprise AI tooling with metered, uncapped billing on top of the seat fee. They built an internal leaderboard ranking teams by how many tokens they burned. Adoption of agentic coding shot from 32% in February to 84% in March. By month four, the year's budget was gone. The CTO had spent $1,200 in a two-hour demo. Average engineers were running $150 to $250 a month. Heavy users were running $500 to $2,000.
Then their COO, Andrew Macdonald, said the quiet part out loud in an interview: it was "very hard to draw a line between the token spend and actual consumer improvements." Uber eventually capped engineers at $1,500 a month per tool. The leaderboard had done its job, just not the job leadership thought it was doing.
Uber's AI rollout is now the most expensive recent illustration of why it's important to differentiate between outputs and outcomes, and of why business goals should focus on results instead of activities.
Outputs vs outcomes, in one minute
The cleanest definition I know is this: an output is something you do, an outcome is something that happens as a consequence of what you do.
A few examples to make it concrete:
- Output: I had a demo with a prospect. Outcome: I closed a customer.
- Output: We ran a €200K AdWords campaign. Outcome: We generated 1,000 qualified leads.
- Output: The Sales team made 1,000 outbound calls this week. Outcome: The Sales team booked 40 qualified meetings.
- Output: Engineering deployed Claude Code to all 5,000 developers. Outcome: Cycle time from ticket to shipped PR dropped from 9 days to 4 days.
In every pair, the output is the activity, the work, the thing the team did. The outcome is the result, the consequence, the thing the business actually wanted in the first place.
Outputs are easy to measure. They're countable, they update in real time, they look impressive in a deck. Outcomes are harder. They lag the work by days, weeks, or months. They depend on multiple variables. They're harder to attribute. So under pressure, leaders default to measuring the output, even when they know the outcome is what they actually care about.
That's the trap. And once you've set the goal on the output, the team optimizes the output. They'd be irrational not to. You told them that's what success looks like.
Uber's AI rollout, translated
Look at Uber's setup through this lens and the failure is obvious in retrospect.
The goal they communicated was usage. The success metric was tokens burned. The reinforcement was a leaderboard ranking teams by that metric. Everything in the system pointed at the output, which was "use AI a lot." Nothing in the system pointed at any outcome, which would have been something like "ship features faster," "reduce defect rate," "cut support load," or "lower engineer-hours per shipped feature."
So the teams did exactly what the goal told them to do. They used AI a lot. Adoption went from 32% to 84% in about a month, roughly a 2.6x jump. Token spend exploded. The leaderboard turned green. The CTO ran up a $1,200 tab in two hours. By every metric Uber had defined, the program was a runaway success.
Then someone asked the awkward question, which is what every leader eventually has to ask: did any of this actually help the product? And the honest answer was: we don't know, and we can't tell. Macdonald's quote about not being able to draw a line between spend and consumer improvements is the polite version of "we measured the output and forgot to measure the outcome."
The cap that followed wasn't really about cost control. It was about the realization that the goal had been set on the wrong thing. Capping engineers at $1,500 was a way of saying "stop optimizing the metric we accidentally made important."
The pattern is everywhere, not just Uber
Uber is the high-profile case, but it's not the only one.
At Meta, an employee built an internal leaderboard called Claudeonomics that ranked roughly 85,000 workers by token consumption. The dashboard logged 60 trillion tokens in 30 days. Leadership publicly cheered the numbers as a productivity signal. When the dashboard leaked outside the company, it was pulled within two days.
At Amazon, an internal usage leaderboard came with an explicit promise that token stats would not count toward performance reviews. Employees gamed it anyway, because nobody believed the promise.
At Duolingo, leadership tied performance evaluations to AI adoption directly. They reversed the policy after staff pointed out it was rewarding tool usage instead of actual results.
And there's the now-famous Garry Tan anecdote from Y Combinator: a developer shipped 37,000 lines of code a day across 5 projects for a 72-day streak. A while later, another developer reported finding 78,400 lines of AI-generated slop in production. Two numbers, both impressive on a leaderboard, both telling you almost nothing about whether the product actually improved.
Each company made the same mistake. The metric they cheered was an output. The thing they actually wanted was an outcome. In none of these cases did the activity number tell anyone whether the business got better.
Why output-based goals quietly destroy progress
When you set a goal on an output and that output happens to move, you get a false impression of progress. Token spend goes up, lines of code go up, demos delivered go up, adoption percentage goes up. Everything on the dashboard looks great. Leadership tells the board the AI rollout is working. The board tells investors the AI rollout is working. Everybody believes the story because the numbers all moved.
Meanwhile, the actual business may or may not have moved at all. And here's the worst part: you can't tell, because you weren't measuring the outcome. You were measuring the output. By the time you notice, you've burned the budget, exhausted the team, and lost a year you can't get back.
There's a second, more insidious problem. Output-based goals quietly tell teams how to do their job. "Use AI more" is a how. "Make 100 outbound calls a day" is a how. "Ship 24 blog posts a quarter" is a how. The team stops thinking about whether the how is actually working, because the how is the goal now. They lose the ability to learn, to swap out a tactic that isn't paying off, to try a different approach. Output-based goals turn knowledge workers into operators of a fixed playbook, even when the playbook is wrong.
Outcome-based goals do the opposite. "Generate 1,000 qualified leads" lets the team figure out the right mix of channels. "Reduce cycle time by 30%" lets engineering decide whether AI tooling, better PR review, smaller batch sizes, or something else is the right lever. The goal stays fixed on the result. The team retains the autonomy to figure out the means.
Goodhart's law shows up every single time
There's a name for the deeper mechanism, and it's worth knowing: Goodhart's law. We've covered it in our article on why you can't run a company on KPIs alone. The short version is that when a measure becomes a target, it ceases to be a good measure. The moment Uber turned token spend into a leaderboard ranking, it stopped being a useful signal of adoption and became a thing teams optimized. Same with Meta's Claudeonomics. Same with Amazon's usage dashboard. Same with the developer who shipped 37,000 lines a day.
This is why output-based goals are particularly dangerous. Outputs are easy to count, which makes them easy to game. The team finds the cheapest way to move the output, and the cheapest way is almost never the way that produces the outcome you wanted.
What good goal-setting looks like instead
If you're heading into a planning round and someone proposes "drive AI adoption" or "increase tool usage" or "ship more code" as a goal, here's what to do.
Ask: what's the outcome we actually want?
Not the activity. The result. For an AI rollout, the candidate outcomes are mostly obvious:
- Reduce cycle time from ticket to shipped PR
- Reduce engineer-hours per shipped feature
- Lower defect rate or reduce rollback incidents
- Cut code review turnaround time
- Reduce support load on engineers
- Ship more features per quarter without adding headcount
Pick the one or two that matter most to the business. Set the goal on those. The team will figure out whether AI tooling is the right way to get there, or whether it's some combination of AI and other changes.
Pressure-test every proposed goal with one question.
"If we hit this number perfectly and nothing else changed, would the business actually be better off?"
If the answer is yes, you've got an outcome. If the answer is no, you've got an output, and you need to keep digging until you find the outcome it was supposed to produce. "Use AI a lot" fails this test instantly. The business isn't better off just because token spend went up. The business is better off only if something downstream of that spend actually moved.
Make the activity an Initiative, not the goal.
"Roll out Claude Code to all engineers" is real work, and it deserves a place in the plan. But that's an Initiative, the how, not the what. It belongs underneath the outcome goal, as one of several possible ways to move that outcome. Sales orgs don't make "1,000 outbound calls" the goal; they make it part of the playbook that supports the actual goal, which is closed revenue. Engineering should do the same with AI tooling.
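One way to picture that structure is as a small data model. The sketch below is illustrative only: the class names, fields, and numbers are assumptions made up for this example, not any tool's actual schema. It reuses the cycle-time example from earlier (9 days down to 4) and assumes a hypothetical current value of 7 days.

```python
from dataclasses import dataclass, field


@dataclass
class Initiative:
    """An activity (the how). It has a status but no success metric of its own."""
    name: str
    done: bool = False


@dataclass
class OutcomeGoal:
    """A result (the what), measured from a baseline toward a target."""
    name: str
    baseline: float
    target: float
    current: float
    initiatives: list[Initiative] = field(default_factory=list)

    def progress(self) -> float:
        """Fraction of the distance from baseline to target covered so far."""
        span = self.target - self.baseline
        if span == 0:
            return 1.0
        return max(0.0, min(1.0, (self.current - self.baseline) / span))


# Hypothetical numbers for illustration only.
goal = OutcomeGoal(
    name="Cycle time from ticket to shipped PR (days)",
    baseline=9.0,
    target=4.0,
    current=7.0,
    initiatives=[
        Initiative("Roll out Claude Code to all engineers", done=True),
        Initiative("Reduce PR batch size"),
    ],
)

print(f"{goal.progress():.0%}")  # prints 40%
```

Marking the rollout as done changes nothing in `progress()`. In this sketch the goal only moves when cycle time moves, which is the whole point of keeping the activity underneath the outcome.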
Watch for the leaderboard instinct.
If you find yourself drafting a dashboard that ranks teams by activity (usage, calls made, lines of code, demos run), you've recreated Uber's setup. Rank by the outcome instead. Better yet, don't rank at all and let teams report on what's working.
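To see how much the choice of ranking matters, here is a minimal sketch with made-up data. The team names, token counts, and cycle times are all hypothetical. The only thing it demonstrates is that sorting by activity and sorting by outcome can produce opposite orders.

```python
# Hypothetical data: (team, tokens burned in millions,
# cycle time before in days, cycle time after in days).
teams = [
    ("Team A", 900, 9.0, 8.5),
    ("Team B", 300, 9.0, 5.0),
    ("Team C", 600, 9.0, 7.0),
]


def cycle_time_reduction(before: float, after: float) -> float:
    """Relative improvement in cycle time; higher is better."""
    return (before - after) / before


# Activity ranking: most tokens burned first.
by_activity = [
    name for name, _, _, _ in sorted(teams, key=lambda t: t[1], reverse=True)
]

# Outcome ranking: largest cycle-time reduction first.
by_outcome = [
    name
    for name, _, _, _ in sorted(
        teams, key=lambda t: cycle_time_reduction(t[2], t[3]), reverse=True
    )
]

print("Ranked by tokens burned:       ", by_activity)
print("Ranked by cycle-time reduction:", by_outcome)
```

With these invented numbers, the team that burned the most tokens lands last on the outcome ranking, and the lightest user lands first. Real data may not be this tidy, but an activity leaderboard gives you no way to find out.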
The lesson
The Uber story has a satisfying narrative arc because it ended in a cap, an apology, and a budget post-mortem. Most of the time, this mistake doesn't end so cleanly. Most of the time, the team just quietly hits the activity metric, the activity metric quietly fails to deliver any actual outcome, and the leadership team quietly moves on to the next initiative without learning anything. The output number wins, and the company loses.
Don't let that happen at your next planning round. Set goals on outcomes, not outputs. Let the team figure out the activities. Measure the results.
How Perdoo helps you keep goals outcome-focused
Perdoo is built around the distinction this article is about. Objectives and Key Results live at the outcome layer in Perdoo, while Initiatives live underneath them as the activities, projects, and tasks meant to move those outcomes. The software structurally separates the two, so a tool rollout, a campaign, or a call quota can never accidentally become the goal itself. It sits where it belongs, as the work in service of the result. If you want goal-setting that won't let an activity metric pose as your strategy, start for free or request a demo.