McKinsey runs 25,000 AI agents. Here is the boutique reply.
The biggest firm in consulting turned the analyst pyramid into an agent pyramid, and it now lets AI help decide who staffs your engagement. A two- or three-person boutique can copy the delivery math and bid against the big firm's team at a fraction of the cost.
Part of our AI Regulation and Compliance News series
Key Takeaways
- The pyramid became an agent pyramid: Bloomberg reported on May 1, 2026 that McKinsey will use AI to help decide which consultants staff client engagements, and the firm now runs about 25,000 AI agents across its work. The leverage model that powered big firms for decades is being rebuilt on software.
- The threat and the opening are the same fact: if a large firm can deliver with fewer people plus agents, so can you. A solo or small boutique that copies the delivery math can credibly bid the same scope a junior team used to own, without the overhead the big firm has to recover.
- Your stack is a workflow, not a tool list: the work is to wire intake, research, drafting, review, and client updates into one agent-assisted pipeline you supervise, so two or three people produce what used to take a staffed team.
- Price the compression, do not discount it: the value is the same and the speed is higher, so the move is to charge for the outcome and the turnaround, not to bill fewer hours at a lower rate. The relationship and the judgment are exactly what no agent replaces, and that is your defensible ground.
The Leveraged Years Briefing
What McKinsey actually did, and why it matters to an independent
On May 1, 2026, Bloomberg reported that McKinsey plans to use AI agents to help choose which consultants get assigned to client engagements. For decades that matching was done by human staffing teams who knew the people and the projects. Now software helps make the call. The same reporting noted that McKinsey runs roughly 25,000 AI agents across client and back-office work, and has cut some review cycles by about a quarter.
Read past the headline and you see the real shift. The consulting business was always a leverage business. A few senior partners sold the work, and a pyramid of analysts and associates did the building underneath them. That pyramid is what let a firm staff a large engagement and bill for the team. What McKinsey is doing now is rebuilding the bottom of that pyramid out of agents, and using AI to manage who sits where on top of it.
If you are an independent consultant or a fractional executive, the instinct is to read that as a threat, and part of it is. The other part is the most useful gift the big firms have handed a small operator in years. They just published the playbook. The unit of delivery is shrinking from a staffed team to a handful of people plus a fleet of agents. You do not need 25,000 agents to use that idea. You need a stack of your own.
The threat is real. The opening is bigger.
Start with the honest version of the threat. When a large firm can deliver a workstream with fewer juniors and more software, its cost to produce a deliverable drops. That makes the mid-market engagements you used to win on price and speed more contested, because the firm can now reach down into work that was not worth its overhead before.
Now turn it over. The big firm still carries the big firm's cost structure. Brand, real estate, partner economics, and the machinery of selling all sit on top of every invoice. You do not carry any of that. If you build an agent-assisted delivery pipeline that compresses the same work, you get the productivity gain without the overhead the firm has to recover. A boutique of two or three people with a real stack can credibly bid a scope that used to require a staffed junior team, and price it as a partner-led engagement rather than a discount.
The opening, put plainly: the client buying from a big firm is paying for a team they will mostly never meet, managed by a partner they rarely see. The client buying from you gets the senior person doing the senior work, with agents doing the assembly underneath. For a lot of buyers that is not a downgrade. It is the thing they actually wanted.
How to build your own agent-augmented delivery stack
A delivery stack is not a pile of subscriptions. It is a defined pipeline that takes a piece of client work from intake to finished deliverable, with an AI assistant doing the heavy assembly at each stage and you supervising the output. Here is a concrete version a small consultant can stand up.
1. Intake and scoping. When a new engagement or workstream lands, the first job is to turn a messy brief into a structured plan. Feed the client materials, the call notes, and the goal to an assistant and have it produce a draft scope, a question list, and a work breakdown. You edit it in minutes instead of building it from a blank page. McKinsey is automating a neighboring decision, staffing, at the firm level. You are applying the same idea at the desk level.
2. Research and synthesis. The associate's old job was reading everything and pulling out what mattered. An assistant does the first pass: summarize the documents, pull the relevant figures, surface the contradictions, draft the market or competitor picture. You direct it and check it. The reading that used to eat a junior's week is the part that compresses the most.
3. Drafting the deliverable. Models and memos and board decks start as structured drafts, not blank documents. Give the assistant your outline, your synthesis, and your point of view, and have it produce the first full draft in your format. Your time goes into the argument and the judgment, not the typing and the formatting.
4. Review and quality control. Use a second pass as a critic. Have the assistant pressure test the logic, flag weak claims, check that the numbers tie out, and list what a skeptical client will push back on. This is the layer that catches the mistakes a tired solo operator misses at 11pm, and it is the layer you must never skip.
5. Client updates and follow-through. The status note, the recap, the next-steps email, the meeting summary. This is real work that quietly consumes a delivery week, and it is the easiest to template and assist. A standing routine that drafts your client updates keeps the relationship warm without taking the hours you need for the thinking.
Notice what this is. It is the analyst pyramid, rebuilt for one desk. Intake, research, drafting, review, and communication were the five things a junior team did. An agent-assisted pipeline does the assembly in each, and you do the part that requires a senior brain. If you want the structured build, that is exactly what The Leveraged Consultant is for, and the evergreen mechanics live in the boutique consultant AI delivery system briefing.
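The five stages above can be sketched as a small pipeline with a sign-off gate between stages. This is a minimal illustration, not a reference implementation: the `Assistant` type, the stage names, and `stub_assistant` are hypothetical placeholders for whatever AI assistant you actually use, and the only rule the code enforces is the one the article insists on, that nothing moves forward until you have reviewed and approved it.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# Hypothetical type: any function that takes a prompt and returns draft text.
Assistant = Callable[[str], str]

# The five stages from the article, in delivery order (names are illustrative).
STAGES = ["intake", "research", "draft", "review", "client_update"]


@dataclass
class Engagement:
    brief: str
    outputs: dict = field(default_factory=dict)   # stage name -> draft text
    approved: set = field(default_factory=set)    # stages the consultant signed off


def run_stage(eng: Engagement, stage: str, assistant: Assistant) -> str:
    """Have the assistant assemble one stage from the brief and earlier approved work."""
    idx = STAGES.index(stage)
    missing = [s for s in STAGES[:idx] if s not in eng.approved]
    if missing:
        # Supervision gate: earlier stages must be human-approved first.
        raise RuntimeError(f"cannot run {stage!r}: unapproved stages {missing}")
    context = "\n".join(eng.outputs[s] for s in STAGES[:idx])
    eng.outputs[stage] = assistant(f"[{stage}] brief: {eng.brief}\n{context}")
    return eng.outputs[stage]


def approve(eng: Engagement, stage: str, edited_text: Optional[str] = None) -> None:
    """Record the consultant's sign-off, optionally replacing the draft with an edit."""
    if stage not in eng.outputs:
        raise RuntimeError(f"nothing to approve for {stage!r}")
    if edited_text is not None:
        eng.outputs[stage] = edited_text
    eng.approved.add(stage)


def ready_to_ship(eng: Engagement) -> bool:
    """True only when every stage has been read and approved by a human."""
    return all(s in eng.approved for s in STAGES)


def stub_assistant(prompt: str) -> str:
    """Stand-in for a real assistant call; echoes the first line of the prompt."""
    return "DRAFT for " + prompt.splitlines()[0]


if __name__ == "__main__":
    eng = Engagement(brief="pricing review for a mid-market client")
    for stage in STAGES:
        run_stage(eng, stage, stub_assistant)
        approve(eng, stage)  # in practice: read it, edit it, then approve
    print(ready_to_ship(eng))
```

The design choice worth noting is that approval is a separate, explicit step. The assistant can produce a draft at any stage, but the pipeline refuses to build on work you have not signed off.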
What this does to your pricing and your leverage
Here is where most independents get it wrong. They build a faster pipeline and then quietly pass the savings to the client by billing fewer hours. That is the discount trap, and it gets worse every time you get faster, because the better your stack the less you earn for the same outcome.
The deliverable is worth what it is worth to the client whether it took you a week or two days. So price the work, not the time. Quote the engagement on the outcome and the turnaround, and let the compression be your margin and your speed advantage rather than a reason to charge less. A boutique that delivers a partner-grade work product in days, at a price below a staffed firm but well above a freelancer, is sitting in the most defensible position in the market. This is a pricing problem worth thinking through on its own, which is the whole subject of repricing consulting after the billable hour.
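The discount trap is easy to see with a little arithmetic. The sketch below uses made-up numbers purely for illustration (a 200 per hour rate, a deliverable that used to take 40 hours and now takes 16, and a fixed price of 8,000); none of these figures come from the reporting, so substitute your own.

```python
# Illustrative assumptions only; replace with your own figures.
HOURLY_RATE = 200.0    # what you bill per hour today
HOURS_BEFORE = 40.0    # roughly a week of delivery without the stack
HOURS_AFTER = 16.0     # roughly two days of delivery with the stack
FIXED_PRICE = 8000.0   # outcome-based quote for the same deliverable


def hourly_revenue(rate: float, hours: float) -> float:
    """Revenue when you bill time: it shrinks as you get faster."""
    return rate * hours


def effective_rate(fixed_price: float, hours: float) -> float:
    """Revenue per hour worked when you bill the outcome: it grows as you get faster."""
    return fixed_price / hours


if __name__ == "__main__":
    for label, hours in (("before", HOURS_BEFORE), ("after", HOURS_AFTER)):
        billed = hourly_revenue(HOURLY_RATE, hours)
        per_hour = effective_rate(FIXED_PRICE, hours)
        print(f"{label}: hourly billing earns {billed:.0f}, "
              f"fixed price earns {FIXED_PRICE:.0f} ({per_hour:.0f} per hour)")
```

Under these assumptions, hourly billing drops from 8,000 to 3,200 for the same deliverable, while the fixed price stays at 8,000 and the effective rate rises from 200 to 500 per hour. The faster the stack, the wider that gap.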
The deeper point is about leverage. For a small firm, leverage used to mean hiring people, which meant managing them, paying them between projects, and hoping the pipeline stayed full enough to keep them busy. An agent stack gives you a different kind of leverage. Your capacity scales with the work without the fixed cost and the management drag of a payroll. That is the same advantage McKinsey is chasing with 25,000 agents, available to a firm of one. Winning the work that fills the stack is a separate discipline, covered in AI for business development for consultants.
What AI does not replace
The McKinsey story has a detail that is easy to miss and important to keep. The thing they are automating is the staffing decision, the matching of people to clients. The thing no agent does is be the trusted person in the room. That distinction is your map.
Agents are good at assembly. They draft, summarize, format, and check. They are not good at the judgment calls that define senior consulting: reading the politics in a client's organization, knowing which recommendation the board can actually act on, deciding what to leave out, and carrying the relationship through a hard quarter. A client does not hire a boutique for typing speed. They hire it for the judgment of the person whose name is on the work.
So the stack has a hard rule. The agent does the assembly. You own the thinking, the final review, and the relationship. The moment you let the pipeline ship work you have not actually read and stood behind, you have traded the one thing that makes a boutique worth more than a big firm's junior team. The senior judgment and the client trust are not the parts you automate. They are the product. Everything else is just faster scaffolding underneath it. For the broader habit of staying accountable for AI output, see the framework for productizing your expertise.
The move this week
You do not need to match the firm. You need a working pipeline for one real engagement. Pick a current piece of client work and run it through the five stages on purpose: draft the scope with an assistant, let it do the research first pass, generate the deliverable draft, run a critic pass, and template your client updates. Time it against how you did it last quarter. The gap you find is your new pricing power and your new capacity, and it compounds with every engagement you put through the same stack.
The big firms just told the whole market where consulting is going. The independents who win the next few years are the ones who build the stack instead of fearing it. If you want the full operating system for a boutique practice, The Leveraged Consultant teaches it directly, and the two-minute course quiz will point you to the right program for your work.
Frequently Asked Questions
How can a 2 or 3 person consulting boutique compete with a firm running 25,000 AI agents?
By copying the delivery math, not the headcount. Bloomberg reported in May 2026 that McKinsey runs about 25,000 AI agents and will use AI to help staff engagements, which means the unit of delivery is shrinking from a staffed team to a few people plus agents. A boutique builds its own agent-assisted pipeline for intake, research, drafting, review, and client updates, then delivers a partner-grade work product without the brand, real estate, and partner overhead a big firm has to recover. The boutique gets the productivity gain at a far lower cost base.
What does an agent augmented delivery stack actually include?
It is a defined pipeline, not a list of apps. Five stages: intake and scoping that turns a messy brief into a structured plan, research and synthesis that does the first reading pass, drafting that produces the deliverable in your format, a review pass that acts as a critic and pressure tests the work, and client updates that draft your recaps and next steps. An AI assistant does the assembly at each stage and the consultant supervises and owns the output.
If AI makes delivery faster, should an independent consultant lower their fees?
No. The deliverable is worth what it is worth to the client whether it took a week or two days, so price the outcome and the turnaround rather than billing fewer hours. Passing the speed gain to the client as a discount is a trap that gets worse as your stack gets better. Let the compression be your margin and your speed advantage, and keep the senior judgment and the client relationship, which are exactly the parts no agent replaces, as the reason your work commands a premium.