An audit found every AI scribe made errors. Here is the 90-second check.
When Ontario tested 20 AI scribe systems, all 20 produced inaccuracies, including invented orders and wrong drugs. This is the fast read-back protocol that catches the three failure modes before you sign.
Key Takeaways
- The finding is stark: on May 12 to 14, 2026, Ontario's Auditor General reported that all 20 AI scribe systems put through procurement testing showed inaccuracies, even though roughly 5,000 Ontario physicians were already using them.
- The three failure modes: 9 systems hallucinated, including fabricated referrals or blood-test orders the doctor never made; 12 captured the wrong drug; and 17 missed key mental-health details.
- This is a Canadian audit, not US law: treat it as a universal accuracy warning about how these tools fail, not a regulation you must follow.
- The fix is a 90-second read-back: a short, targeted review aimed squarely at the wrong drug, the phantom order, and the dropped mental-health detail. It catches the exact errors the audit found before your signature goes on the note.
The Leveraged Years Briefing.
What the Ontario audit actually found
This one deserves attention because it was not a vendor demo or a press release. It was an independent government audit, and the result was blunt.
On May 12 to 14, 2026, Ontario's Auditor General reported on AI scribe systems used in the province's healthcare system. Across 20 systems put through procurement testing, every single one showed inaccuracies. Not most. All twenty. And this is not a fringe tool: roughly 5,000 Ontario physicians were already using these systems when the audit ran.
To be clear about scope, this is a Canadian audit. It is not US law and it does not bind your practice. But accuracy is not a jurisdiction. The way these tools fail in Ontario is the way they fail everywhere, because it is baked into how the underlying technology works. So read this as a warning about the tool category, not a foreign regulation you can ignore.
The three ways the notes went wrong
The audit did not just say errors happened. It found patterns, and the patterns are specific enough to defend against. Three failure modes stood out.
- Hallucinated content. Nine of the systems fabricated information, including referrals or blood-test orders the doctor never actually made. The note documented care that did not happen.
- Wrong drug. Twelve systems captured the wrong medication, which in a clinical note is among the most dangerous errors there is.
- Dropped mental-health detail. Seventeen systems missed key mental-health information, the kind of nuance that does not survive automated transcription well and that matters enormously to the patient.
Notice the shape of these. They are not random typos. They are confident, plausible-looking errors: an order that reads like a real order, a drug name that looks right, a clean note that is clean because something important fell out of it. That is exactly the kind of error a quick glance misses, which is why a glance is not enough.
There is a reason these three failure modes cluster the way they do. Drug names are short, similar-sounding, and spoken fast in a real visit, which is hard for any transcription system to get right. Orders and referrals are easy for a fluent model to invent because they fit the expected pattern of a note, so a plausible one slips in without a trigger in the audio. And mental-health detail is exactly the kind of nuanced, lightly stated, easily paraphrased content that gets smoothed away when a model compresses a conversation into a tidy note. Knowing why each one happens tells you precisely where to point your attention.
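To put the three counts in proportion, here is a minimal Python sketch that converts each one into a share of the 20 systems tested. The counts are the audit figures quoted above; the labels and the function name `share_of_systems` are chosen here purely for illustration.

```python
# Counts as quoted from the audit; labels are illustrative, not the audit's wording.
TOTAL_SYSTEMS = 20
failure_counts = {
    "hallucinated content": 9,
    "wrong drug": 12,
    "dropped mental-health detail": 17,
}


def share_of_systems(count: int, total: int = TOTAL_SYSTEMS) -> int:
    """Return the count as a whole-number percentage of the systems tested."""
    return round(100 * count / total)


shares = {mode: share_of_systems(n) for mode, n in failure_counts.items()}

for mode, pct in shares.items():
    print(f"{mode}: {failure_counts[mode]} of {TOTAL_SYSTEMS} systems ({pct}%)")
```

This works out to 45%, 60%, and 85% of the systems tested. The three counts add up to 38, more than the 20 systems, so at least some systems showed more than one failure mode.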
The 90-second read-back protocol
You do not need to re-document the visit to catch these. You need a fast, targeted read-back aimed at the three failure modes the audit found. Do it before you sign, every time. It runs in about 90 seconds.
- Check the medications, first and hardest. Read every drug name in the note against what you actually prescribed or discussed. This is where the 12-system wrong-drug failure lives. If a name looks even slightly off, fix it before anything else.
- Hunt for phantom orders. Scan the note for any referral, lab, imaging, or order, and confirm you actually made each one. The audit found 9 systems inventing orders and referrals. If you did not order it, delete it. A fabricated blood test or referral in the chart is your liability the moment you sign.
- Confirm the mental-health detail survived. If the visit touched mood, risk, psychiatric history, or anything in that category, check that it is captured accurately and completely. Seventeen systems dropped this kind of detail. If it is thin or missing, add it back in your own words.
Then read the note once, top to bottom, for anything that simply does not match your memory of the visit. Then sign.
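For readers who think in checklists, the order and logic of the three checks above can be sketched in Python. This is an illustration only, not a tool and not a substitute for reading the note yourself. The `DraftNote` structure, the `read_back` function, and the sample values are all hypothetical; no scribe product is assumed to expose its output this way, and the "truth" inputs are simply what you remember prescribing and ordering.

```python
from dataclasses import dataclass


@dataclass
class DraftNote:
    """Hypothetical view of an AI-drafted note, for illustration only."""
    medications: list[str]
    orders: list[str]
    mental_health_text: str


def read_back(note: DraftNote, prescribed: list[str], ordered: list[str],
              mental_health_discussed: bool) -> list[str]:
    """Return items to fix before signing, in the protocol's order."""
    flags = []

    # 1. Medications first: every drug in the note must match what you prescribed.
    prescribed_set = {m.lower() for m in prescribed}
    for med in note.medications:
        if med.lower() not in prescribed_set:
            flags.append(f"MEDICATION: '{med}' does not match what you prescribed")

    # 2. Phantom orders: anything in the note you did not actually order.
    ordered_set = {o.lower() for o in ordered}
    for order in note.orders:
        if order.lower() not in ordered_set:
            flags.append(f"ORDER: '{order}' was not ordered by you; delete it")

    # 3. Mental-health detail: discussed in the visit but absent from the note.
    if mental_health_discussed and not note.mental_health_text.strip():
        flags.append("MENTAL HEALTH: discussed in the visit but missing from the note")

    return flags


# Illustrative values only; none of these come from the audit.
note = DraftNote(
    medications=["hydralazine"],
    orders=["CBC", "cardiology referral"],
    mental_health_text="",
)
flags = read_back(note, prescribed=["hydroxyzine"], ordered=["CBC"],
                  mental_health_discussed=True)
for flag in flags:
    print(flag)
```

In this made-up example all three checks fire: a drug name that does not match, a referral nobody ordered, and a mental-health discussion that never reached the note. The sketch can only detect that mental-health content is absent; judging whether what is there is accurate and complete still takes your own reading.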
Why this is different from the consent question
A boundary worth drawing. This protocol is about accuracy: making sure the note is true before you attest to it. That is a separate problem from whether your patient knows or agreed to the AI listening, which is consent. If consent is your concern, the briefing on when AI scribes need patient consent covers that directly. Here we are only catching errors. The companion briefing on how doctors use AI for clinical notes safely gives the broader safe-use habit that this 90-second check slots into.
The honest limit of these tools
The useful thing about the Ontario audit is that it kills a comfortable assumption: that the good scribes are accurate and only the cheap ones fail. All 20 failed. That means accuracy review is not a vendor-selection problem you can buy your way out of. It is a permanent part of using the category.
This is not an argument against AI scribes. They save real time, and the time saved is worth keeping. It is an argument that the time saved on drafting has to be partly reinvested in review. The 90 seconds you spend on the read-back is the cheapest insurance you will buy all day.
The skill under the tool
The pattern across every briefing in this series is the same. The tool gets faster, more fluent, more confident, and none of that makes it more accurate. What protects the patient and your license is a human who reads what the machine produced with a trained, skeptical eye and knows exactly where these tools tend to break.
That skill is learnable and it does not expire when the next scribe version ships. AI for Physician Notes builds the verify-before-you-sign workflow, including the targeted read-back this audit calls for, and the two-minute course quiz will point you to the right starting place for your practice.
Frequently Asked Questions
How bad were the AI scribe errors in the Ontario audit?
Significant and universal. On May 12 to 14, 2026, Ontario's Auditor General reported that all 20 procurement-tested AI scribe systems showed inaccuracies. Specifically, 9 hallucinated content like fabricated referrals or blood-test orders, 12 captured the wrong drug, and 17 missed key mental-health details, even though roughly 5,000 Ontario physicians were already using them.
Does this Ontario audit apply to me if I practice in the US?
It is a Canadian audit, so it is not US law and does not bind your practice. But it is a credible, independent warning about how AI scribes fail as a tool category, and those failure modes are not specific to one country. Treat it as a universal accuracy warning worth acting on regardless of where you practice.
What is the fastest way to catch these errors before I sign?
A targeted read-back of about 90 seconds aimed at the three failure modes. Check every drug name against what you actually prescribed, scan for any referral or order and confirm you made it, and verify any mental-health detail is captured accurately. Then read the note once for anything that does not match the visit, and sign.
Is this briefing legal or medical advice?
No. The Leveraged Years is an education company, not a law or medical firm. This is a plain summary of an independent audit and a practical review habit, and tools and findings can change. Treat it as background, and confirm anything affecting your documentation, patient safety, or liability with a qualified professional.