The Leveraged Years · Practical Claude for senior professionals

Legal

A study says gen AI beat TAR in document review. Here is what to actually change.

A widely covered study found generative AI performed strongly against technology-assisted review. The useful question is not whether it is impressive, but where it is defensible and how you validate it.


6 min read · Legal · Updated June 2026

Key Takeaways

What the study found: a major study covered by Law.com Legaltech News on June 9, 2026, summarized as "Better Than TAR, Nearly Expert," found generative AI performed strongly against technology-assisted review, or TAR, in a complex document review.
What it does not say: "better than TAR" in one study is not "use it blindly everywhere." It is a strong signal that gen-AI review is becoming a serious option, not a finished verdict that retires older methods overnight.
What to change Monday: the move is not to swap tools in a panic. It is to learn where gen-AI review is defensible, how to validate it against a known standard, and what your disclosure posture should be before you rely on it.
The real takeaway: the technology is catching up to the experts faster than the process around it. The partner who wins is the one who can defend the method, not the one who simply has the newest tool.


What the study actually found

On June 9, 2026, Law.com Legaltech News covered a major study on generative AI in document review. The headline framing was blunt: "Better Than TAR, Nearly Expert." In a complex document review, generative AI performed strongly against technology-assisted review, the predictive coding approach that has been the standard for large-scale review for over a decade. The reporting placed gen-AI's performance close to expert human review and ahead of TAR in that test.

For anyone who has run a large review, that is a real result. TAR earned its place because it works and because courts accept it. A study putting generative AI ahead of it in a complex matter is not noise. It is a marker that the ground is moving.

What it does not mean

Now the honest part, because the headline is louder than the conclusion. One strong study is not a license to rip out your review process and run everything through a chatbot.

"Better than TAR, nearly expert" describes a result in a particular review, under particular conditions. It tells you the ceiling is rising. It does not tell you that gen-AI review is the right call for your specific matter, your specific data, or a specific judge. The gap between "this performed well in a study" and "this is defensible in my case" is exactly the gap a careful partner is paid to manage.

So the correct reaction is not adoption or dismissal. It is to treat gen-AI review as a serious option that now deserves a real evaluation, run with the same rigor you would apply to any method you intend to defend.

It helps to remember how TAR itself became normal. It was not because a study impressed people. It was because, over years, lawyers learned to validate it, courts saw the validation, and a body of practice formed around how to defend it. Generative AI is earlier on that same path. A strong study is a step, not the destination. The firms that come out ahead will be the ones that build the validation practice now, while the tool is improving, rather than the ones that either ignore it or trust it blindly. Both of those shortcuts skip the part that actually mattered for TAR.

Where gen-AI review is defensible

Defensibility in review has never been about the tool's brilliance. It is about whether you can show that your process found what it was supposed to find. That standard does not change because the engine changed.

Gen-AI review is most defensible where you can:

Define the task clearly. The clearer your description of what is responsive or relevant, the better these tools perform and the easier the result is to explain.
Measure the result against a known standard. A method is defensible when you can show its recall and precision on a validated sample, not when you simply trust it.
Reproduce and explain the process. If you cannot describe how the review was run and why, you cannot defend it, no matter which tool produced the result.

Where those conditions are weak, the case for gen-AI review weakens with them, regardless of what a study found in a cleaner setting.

Notice that none of those three conditions mentions the tool. That is deliberate. A defensible review is a property of your process, not of the software you ran it through. The study tells you the tool can clear a high bar. Whether your review clears it depends on choices you make: how well you framed the task, how honestly you measured, how clearly you can explain the result later. A weak process with a strong tool is still a weak process, and a complex matter is exactly where that weakness surfaces under scrutiny.

How to validate it before you rely on it

This is the part a partner should actually own. Validation is not a formality you do after the fact. It is the thing that turns an impressive tool into a defensible result.

The core move is the same one that made TAR acceptable: test against a known answer. Build or use a validation sample where you know the correct classifications, run the gen-AI review against it, and measure how well it did. If recall and precision hold up on the sample, you have evidence. If they do not, you have caught the problem before it reached a court instead of after.
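To make "measure how well it did" concrete, here is a minimal Python sketch of scoring a review against a validation sample. It assumes you hold two sets of labels keyed by document ID: the known-correct classifications and the labels the gen-AI review produced. The names `score_review`, `ReviewMetrics`, and the `DOC-…` IDs are hypothetical, chosen for illustration rather than taken from any review platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ReviewMetrics:
    """Counts from comparing model labels to a known-answer sample."""
    true_positives: int   # responsive, and the model flagged it
    false_positives: int  # not responsive, but the model flagged it
    false_negatives: int  # responsive, but the model missed it

    @property
    def recall(self) -> float:
        # Share of truly responsive documents that the review found.
        responsive = self.true_positives + self.false_negatives
        return self.true_positives / responsive if responsive else 0.0

    @property
    def precision(self) -> float:
        # Share of flagged documents that were truly responsive.
        flagged = self.true_positives + self.false_positives
        return self.true_positives / flagged if flagged else 0.0


def score_review(known: dict[str, bool], model: dict[str, bool]) -> ReviewMetrics:
    """Compare model labels to known labels (True means responsive)."""
    missing = known.keys() - model.keys()
    if missing:
        # A document the model never labeled is a process gap, not a score.
        raise ValueError(f"No model label for: {sorted(missing)}")

    tp = fp = fn = 0
    for doc_id, is_responsive in known.items():
        flagged = model[doc_id]
        if flagged and is_responsive:
            tp += 1
        elif flagged and not is_responsive:
            fp += 1
        elif not flagged and is_responsive:
            fn += 1
    return ReviewMetrics(tp, fp, fn)


# Hypothetical five-document sample, for illustration only.
known_labels = {"DOC-1": True, "DOC-2": True, "DOC-3": False, "DOC-4": False, "DOC-5": True}
model_labels = {"DOC-1": True, "DOC-2": False, "DOC-3": True, "DOC-4": False, "DOC-5": True}

metrics = score_review(known_labels, model_labels)
print(f"recall={metrics.recall:.2f} precision={metrics.precision:.2f}")
```

A real validation sample would be far larger and drawn so that it represents the collection. The arithmetic stays this simple, which is part of why the result is easy to explain when the method is questioned.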

Treat the model's confidence as a starting point, never the proof. The model can be sure and wrong. Your validation sample is what tells the truth, and your documentation of that validation is what you hand to opposing counsel or the bench when the method is questioned.
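One way to test whether confidence deserves any trust is to check how often the model was right at each confidence level on the validation sample. The sketch below assumes your tool reports a confidence score between 0 and 1 for each document, which not every gen-AI review tool does. The band edges and the function name `accuracy_by_confidence` are illustrative assumptions, not a standard.

```python
from collections import defaultdict


def accuracy_by_confidence(results, band_edges=(0.5, 0.8, 0.95)):
    """Return accuracy per confidence band.

    results: iterable of (confidence, model_flagged, actually_responsive).
    Bands are numbered by how many edges the confidence meets or exceeds,
    so with the default edges band 0 is below 0.5 and band 3 is 0.95 and up.
    """
    tallies = defaultdict(lambda: [0, 0])  # band -> [correct, total]
    for confidence, flagged, actual in results:
        band = sum(confidence >= edge for edge in band_edges)
        tallies[band][1] += 1
        if flagged == actual:
            tallies[band][0] += 1
    return {band: correct / total for band, (correct, total) in sorted(tallies.items())}


# Hypothetical validation results, for illustration only.
sample_results = [
    (0.99, True, True),
    (0.97, True, False),   # very confident and wrong
    (0.96, False, False),
    (0.85, True, True),
    (0.60, False, True),
    (0.55, True, True),
]

band_accuracy = accuracy_by_confidence(sample_results)
for band, accuracy in band_accuracy.items():
    print(f"band {band}: {accuracy:.2f}")
```

If the highest band is not clearly more accurate than the lower ones, the confidence score is telling you little, and the validation sample is doing all the work.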

There is a partner-level discipline buried in this that is easy to skip. Decide your acceptable recall and precision before you run the validation, not after you see the numbers. If you set the bar after the fact, you will be tempted to call whatever you got "good enough," and that is exactly the reasoning that falls apart under cross-examination. Set the standard first. If the review clears it, you have a defensible result and a clean story. If it does not, you have learned something important cheaply, before it cost a client, and you can adjust the task definition or the method and run it again. Validation that can only ever pass is not validation. The willingness to be told no is what gives the yes its weight.
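The "set the standard first" discipline can be written down as a small check. In this sketch the acceptance criteria are fixed in an immutable object before any numbers are seen, and the review either clears them or it does not. The thresholds shown (0.80 recall, 0.70 precision) are placeholders for illustration, not a recommended or court-approved standard, and the names `AcceptanceCriteria` and `evaluate` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)  # frozen: the bar cannot be edited after it is set
class AcceptanceCriteria:
    min_recall: float
    min_precision: float


def evaluate(criteria: AcceptanceCriteria, true_positives: int,
             false_positives: int, false_negatives: int) -> tuple[bool, list[str]]:
    """Return (passed, reasons) for validation counts against preset criteria."""
    responsive = true_positives + false_negatives
    flagged = true_positives + false_positives
    recall = true_positives / responsive if responsive else 0.0
    precision = true_positives / flagged if flagged else 0.0

    reasons = []
    if recall < criteria.min_recall:
        reasons.append(f"recall {recall:.3f} is below the preset {criteria.min_recall:.2f}")
    if precision < criteria.min_precision:
        reasons.append(f"precision {precision:.3f} is below the preset {criteria.min_precision:.2f}")
    return (not reasons, reasons)


# Step 1: record the bar before the validation is run (placeholder values).
criteria = AcceptanceCriteria(min_recall=0.80, min_precision=0.70)

# Step 2: run the validation, then check the counts (hypothetical numbers).
passed, reasons = evaluate(criteria, true_positives=160,
                           false_positives=40, false_negatives=60)
print("PASS" if passed else "FAIL", reasons)
```

In this made-up run, precision clears the bar and recall does not, so the review fails. That is the useful outcome the paragraph above describes: a "no" you can act on by adjusting the task definition or the method before running it again.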

The broader split between using AI to find and read and using it to produce work is a separate and useful distinction. We cover it in AI legal research vs drafting, because the validation mindset here mirrors the verification mindset there.

The disclosure posture to decide in advance

The question that catches firms off guard is not technical. It is "did you tell anyone?" Disclosure of your review method can come up with opposing counsel, in a meet and confer, or with the court, and the worst time to form a position is mid-dispute.

Decide your posture before you start. Know how you would describe the method if asked, what you are prepared to share about your validation, and where your duty of candor lands for the matter. You do not have to volunteer every detail in every case, but you should never be improvising the answer under pressure. A method you can describe calmly is a method you can defend. If you are the senior person setting that posture for a team, the senior lawyer AI operating model covers how to make it consistent across matters.

The skill under the tool

A new study will land every few months, and each one will push some tool ahead of the last. None of that changes the thing that actually wins cases: a process you can explain, validate, and defend. The tool is the same for everyone who buys it. The judgment about where it belongs, and the discipline to prove it worked, is not.

That is the part worth building, because it outlasts any single study or product. The Leveraged Attorney teaches that method for using AI on real legal work without losing the defensibility that the work depends on, and the two-minute course quiz will point you to the right starting place.

Frequently Asked Questions

Does this study mean I should stop using TAR?

No. A major study covered by Law.com Legaltech News on June 9, 2026, found generative AI performed strongly against TAR and close to expert review in a complex matter. That is a strong signal, not a verdict that retires TAR. TAR is still accepted and effective. The move is to evaluate gen-AI review seriously for the right matters, not to swap methods reflexively.

How do I make a gen-AI review defensible?

The same way TAR became defensible: validate against a known standard. Test the review on a sample where you know the correct classifications, measure recall and precision, and document the result. Defensibility comes from a process you can explain and prove, not from the tool's confidence or a headline about its performance.

Do I have to disclose that I used AI for review?

It depends on the matter, the rules in your jurisdiction, and what opposing counsel or the court asks. The practical advice is to decide your disclosure posture before you start, so you can describe your method and your validation calmly if it comes up, rather than improvising under pressure. Confirm the specific obligation for your case with a qualified professional.

Is this briefing legal advice?

No. The Leveraged Years is an education company, not a law firm. This is a plain-language explainer of a single study and the practical questions around it, and the law and the technology are both moving quickly. Treat it as background, and confirm anything that affects a live matter with a qualified professional.