Educational Measurement - AI Medical Compendium

Assessing the performance of ChatGPT-4o on the Turkish Orthopedics and Traumatology Board Examination.

Joint diseases and related surgery Apr 5, 2025

OBJECTIVES: This study aims to assess the overall performance of ChatGPT version 4-omni (GPT-4o) on the Turkish Orthopedics and Traumatology Board Examination (TOTBE) using actual examinees as a reference point to evaluate and compare the performance...

Education, Medical, Graduate; Specialty Boards; Orthopedics; Clinical Competence; Generative Artificial Intelligence; Humans; Educational Measurement; Turkey; Traumatology

The performance of ChatGPT and ERNIE Bot in surgical resident examinations.

International journal of medical informatics Apr 4, 2025

STUDY PURPOSE: To assess the application of these two large language models (LLMs) for surgical resident examinations and to compare the performance of these LLMs with that of human residents.

Internship and Residency; Language; China; Generative Artificial Intelligence; Humans; General Surgery; Educational Measurement

Using a Hybrid of AI and Template-Based Method in Automatic Item Generation to Create Multiple-Choice Questions in Medical Education: Hybrid AIG.

JMIR formative research Apr 4, 2025

BACKGROUND: Template-based automatic item generation (AIG) is more efficient than traditional item writing but it still heavily relies on expert effort in model development. While nontemplate-based AIG, leveraging artificial intelligence (AI), offers...

Humans; Artificial Intelligence; Education, Medical; Adult; Educational Measurement

Semantic Clinical Artificial Intelligence vs Native Large Language Model Performance on the USMLE.

JAMA network open Apr 1, 2025

IMPORTANCE: Large language models (LLMs) are being implemented in health care. Enhanced accuracy and methods to maintain accuracy over time are needed to maximize LLM benefits.

Educational Measurement; United States; Semantics; Artificial Intelligence; Language; Comparative Effectiveness Research; Large Language Models; Licensure, Medical; Humans

Evaluating the value of AI-generated questions for USMLE step 1 preparation: A study using ChatGPT-3.5.

Medical teacher Mar 27, 2025

PURPOSE: Students are increasingly relying on artificial intelligence (AI) for medical education and exam preparation. However, the factual accuracy and content distribution of AI-generated exam questions for self-assessment have not been systematica...

Artificial Intelligence; Students, Medical; Educational Measurement; Humans; Education, Medical, Undergraduate; Licensure, Medical; Generative Artificial Intelligence; Self-Assessment; United States

Accuracy of LLMs in medical education: evidence from a concordance test with medical teacher.

BMC medical education Mar 26, 2025

BACKGROUND: There is an unprecedented increase in the use of Generative AI in medical education. There is a need to assess these models' accuracy to ensure patient safety. This study assesses the accuracy of ChatGPT, Gemini, and Copilot in answering ...

Education, Medical; Faculty, Medical; Humans; Educational Measurement; Clinical Competence; United States

Accuracy and quality of ChatGPT-4o and Google Gemini performance on image-based neurosurgery board questions.

Neurosurgical review Mar 25, 2025

Large language models (LLMs) have shown the capability to effectively answer medical board examination questions. However, their ability to answer image-based questions has not been examined. This study sought to evaluate the performance of two LLMs (...

Neurosurgery; Internship and Residency; Clinical Competence; Humans; Educational Measurement; Neurosurgical Procedures

Performance of Plug-In Augmented ChatGPT and Its Ability to Quantify Uncertainty: Simulation Study on the German Medical Board Examination.

JMIR medical education Mar 21, 2025

BACKGROUND: GPT-4 is a large language model (LLM) trained and fine-tuned on an extensive dataset. After the public release of its predecessor in November 2022, the use of LLMs has seen a significant spike in interest, and a multitude of potential...

Educational Measurement; Germany; Uncertainty; Specialty Boards; Humans; Clinical Competence

A tutorial activity for students to experience generative artificial intelligence: students' perceptions and actions.

Advances in physiology education Mar 19, 2025

Freely accessible generative artificial intelligence (GenAI) poses challenges to physiology education regarding learning and academic integrity. Although many studies have explored the capabilities of GenAI to complete assessments, few have implement...

Humans; Educational Measurement; Generative Artificial Intelligence; Students, Medical; Perception; Male; Artificial Intelligence; Problem-Based Learning; Female; Students; Physiology

Using aggregated AI detector outcomes to eliminate false positives in STEM-student writing.

Advances in physiology education Mar 19, 2025

Generative artificial intelligence (AI) large language models have become sufficiently accessible and user-friendly to assist students with course work, studying tactics, and written communication. AI-generated writing is almost indistinguishable fro...

Male; Students; False Positive Reactions; Educational Measurement; Writing; Female; Humans; Artificial Intelligence
