AIMC Topic: Educational Measurement

Evaluating Large Language Models on American Board of Anesthesiology-style Anesthesiology Questions: Accuracy, Domain Consistency, and Clinical Implications.

Journal of cardiothoracic and vascular anesthesia
Recent advances in large language models (LLMs) have led to growing interest in their potential applications in medical education and clinical practice. This study evaluated whether five widely used and highly developed LLMs: ChatGPT-4, Gemini, Claude...

LLM-Generated multiple choice practice quizzes for preclinical medical students.

Advances in physiology education
Multiple choice questions (MCQs) are frequently used in medical education for assessment. Automated generation of MCQs in board-exam format could potentially save significant effort for faculty and generate a wider set of practice materials for stude...

Comparison of a generative large language model to pharmacy student performance on therapeutics examinations.

Currents in pharmacy teaching & learning
OBJECTIVE: To compare the performance of a generative language model (ChatGPT-3.5) to pharmacy students on therapeutics examinations.

Performance of ChatGPT-4 Omni and Gemini 1.5 Pro on Ophthalmology-Related Questions in the Turkish Medical Specialty Exam.

Turkish journal of ophthalmology
OBJECTIVES: To evaluate the response and interpretative capabilities of two pioneering artificial intelligence (AI)-based large language model (LLM) platforms in addressing ophthalmology-related multiple-choice questions (MCQs) from Turkish Medical S...

Evaluating the Performance of Reasoning Large Language Models on Japanese Radiology Board Examination Questions.

Academic radiology
RATIONALE AND OBJECTIVES: This study evaluates the performance, cost, and processing time of OpenAI's reasoning large language models (LLMs) (o1-preview, o1-mini) and their base models (GPT-4o, GPT-4o-mini) on Japanese radiology board examination que...

AI in action: Changes to student perceptions when using generative artificial intelligence for the creation of a multimedia project-based assessment.

European journal of pharmacology
INTRODUCTION: New modes of assessment are needed to evaluate the authenticity of student learning in an artificial intelligence (AI) world. In mid-2023, we piloted a new assessment type: a collaborative group multimedia assessment with AI allowan...

Artificial intelligence (AI) performance on pharmacy skills laboratory course assignments.

Currents in pharmacy teaching & learning
OBJECTIVE: To compare pharmacy student scores to scores of artificial intelligence (AI)-generated results of three common platforms on pharmacy skills laboratory assignments.

Comparison of ChatGPT-4, Copilot, Bard and Gemini Ultra on an Otolaryngology Question Bank.

Clinical otolaryngology : official journal of ENT-UK ; official journal of Netherlands Society for Oto-Rhino-Laryngology & Cervico-Facial Surgery
OBJECTIVE: To compare the performance of Google Bard, Microsoft Copilot, GPT-4 with vision (GPT-4) and Gemini Ultra on the OTO Chautauqua, a student-created, faculty-reviewed otolaryngology question bank.

Analysis of ChatGPT-4's performance on ophthalmology questions from the MIR exam.

Archivos de la Sociedad Espanola de Oftalmologia
PURPOSE: To evaluate the performance of ChatGPT in solving clinical scenarios in ophthalmology, specifically questions from the specialty exams for Resident Medical Interns (MIR).

Comparison of ChatGPT plus (version 4.0) and pretrained AI model (Orthopod) on orthopaedic in-training exam (OITE).

The surgeon : journal of the Royal Colleges of Surgeons of Edinburgh and Ireland
INTRODUCTION: Recent advancements in large language model (LLM) artificial intelligence (AI) systems, like ChatGPT, have showcased ability in answering standardized examination questions, but their performance is variable. The goal of this study was ...