Understanding Clinician Edits to Ambient AI Draft Notes: A Feasibility Analysis Using Large Language Models

Journal: medRxiv

Published Date: Mar 2, 2026

Abstract

Ambient AI documentation tools generate draft notes that clinicians review and edit before signing them in the electronic health record. Scalable computational approaches for characterizing how clinicians modify these drafts remain limited, yet they are essential for evaluating and improving AI effectiveness. We examined the feasibility of using a few-shot prompted large language model (LLM) to categorize sentence-level edits between AI drafts and final documentation. We developed five label-specific binary models targeting medication, symptom, diagnosis, orders/tests/procedures, and social history edits, and refined the prompts using adversarial negatives and verification gates. The models were evaluated against a human-annotated corpus. The medication and symptom models achieved promising performance (F1=0.787 and 0.780, respectively), whereas the remaining models were precision-limited. Errors clustered in long, complex edits and in edits with category-boundary ambiguity. Therefore, prompt-engineered LLMs are reliable for categorizing edits that carry explicit cues, whereas for complex, context-dependent categories they are better suited to triage, labeling edits for human review.
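The abstract reports one F1 score per label-specific binary model. As a minimal sketch of how such per-label scores could be computed, the Python below treats each edit as carrying a set of labels and scores every label as its own yes/no task. The label identifiers, function names, and toy data are all hypothetical and chosen only for illustration; the abstract does not describe the authors' evaluation code, and the toy labels are not results from the study.

```python
from typing import Dict, List, Set

# Hypothetical identifiers for the five edit categories named in the abstract.
LABELS = [
    "medication",
    "symptom",
    "diagnosis",
    "orders_tests_procedures",
    "social_history",
]


def binary_prf(gold: List[bool], pred: List[bool]) -> Dict[str, float]:
    """Precision, recall, and F1 for one binary (label vs. not-label) task."""
    tp = sum(1 for g, p in zip(gold, pred) if g and p)
    fp = sum(1 for g, p in zip(gold, pred) if not g and p)
    fn = sum(1 for g, p in zip(gold, pred) if g and not p)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


def per_label_scores(
    gold_sets: List[Set[str]], pred_sets: List[Set[str]]
) -> Dict[str, Dict[str, float]]:
    """Score each label independently, one binary task per label."""
    if len(gold_sets) != len(pred_sets):
        raise ValueError("gold and predicted annotations must align one-to-one")
    return {
        label: binary_prf(
            [label in g for g in gold_sets],
            [label in p for p in pred_sets],
        )
        for label in LABELS
    }


# Toy data, invented for illustration only (one set of labels per edit).
gold = [{"medication"}, {"symptom"}, {"medication", "symptom"}, set()]
pred = [{"medication"}, {"medication"}, {"medication"}, {"symptom"}]

scores = per_label_scores(gold, pred)
for label, s in scores.items():
    print(f"{label}: P={s['precision']:.3f} R={s['recall']:.3f} F1={s['f1']:.3f}")
```

In the toy data the medication classifier finds every true medication edit but also fires once on a non-medication edit, so its recall is perfect while its precision drops. That is the same pattern the abstract calls "precision-limited", and it is the case where routing flagged edits to human review is a natural fit.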

Authors

Guo Y.; Zhou Y.; Hu D.; Sutari S.; Chow E.; Tam S.; Perret D.; Pandita D.; Zheng K.

