Impact of an Ambient AI Scribe on Medical Student Objective Structured Clinical Examination Notes: Nonrandomized Clinical Trial.
Journal:
JMIR Medical Education
Published Date:
Jun 2, 2026
Abstract
BACKGROUND: Ambient artificial intelligence (AI) scribes for chart documentation have seen rapid adoption in clinical practice, but their educational impact on medical students has not been described.
OBJECTIVE: The purpose of this study was to determine the impact of an AI scribe on preclerkship medical student note writing.
METHODS: In this prospective nonrandomized pretest-posttest study, all first-year medical students (N=104) at a single US medical school submitted "human-only" notes based on a summative objective structured clinical examination station in May 2025. An AI scribe generated independent AI notes after the objective structured clinical examination from recorded audio. A subgroup of students (n=47) consented to complete a second "hybrid" note by revisiting their human-only notes and incorporating AI notes as perceived necessary, followed by a brief survey about the AI notes. Trained, blinded fourth-year medical student raters were randomly assigned to score all notes on 10 elements using QNOTE acceptability criteria (0="unacceptable," 50="partially acceptable," and 100="fully acceptable"). A post hoc, exploratory element-level review was then conducted.
RESULTS: Across all elements, median evaluation scores of human-only notes were high (range 81.3-100) and similar between students who submitted "hybrid" notes and those who did not. In paired analyses between "human-only" and "hybrid" notes, the only notable element-level change was a decline in "chief complaint" scores (P=.05). Symptom duration was mentioned in the chief complaint section in 17% (8/47) of the AI notes. No score differences were observed in QNOTE elements requiring documentation of pertinent findings and prioritized lists. Participants agreed that the AI note "was more concise than my note" (37/47, 78.7%) and would be "helpful as a first draft" (31/47, 66%); 55.3% (26/47) agreed that the AI note "left out important details," and 21.3% (10/47) agreed that the AI note "may reduce my ability to learn how to write a good note."
CONCLUSIONS: Interaction with AI notes among preclerkship medical students had little impact on the quality of "hybrid" notes. Chief complaint scores likely declined due to conciseness in AI notes that often omitted symptom duration. Our findings suggest that, among students who predominantly wrote close to fully acceptable "human-only" notes, there was no detriment to clinical reasoning, and students were discerning in balancing AI's conciseness and its omissions. The lack of impact on note quality may have been due to the workflow used in this study, in which students were required to generate independent judgments before exposure to AI-generated content. Future work must explore longitudinal use of such tools using standard workflows observed in clinical settings, where AI notes serve as true first drafts. AI scribes could enhance students' own note writing, especially for lower-performing students, although educational safeguards are necessary given the potential for harm due to overreliance on automated systems.