AIMC Topic: Educational Measurement

Showing 1 to 10 of 227 articles

A comparative analysis of DeepSeek R1, DeepSeek-R1-Lite, OpenAi o1 Pro, and Grok 3 performance on ophthalmology board-style questions.

Scientific reports
The ability of large language models (LLMs) to accurately answer medical board-style questions reflects their potential to benefit medical education and real-time clinical decision-making. With the recent advance to reasoning models, the latest LLMs ...

Educational improvement through machine learning: Strategic models for better PISA scores.

PloS one
In this study, in addition to traditional variables such as economic wealth or the number of books read, on which many studies have already been conducted, variables that are thought to influence student achievement and better predict success are ide...

Enhancing intercultural competence in technical higher education through AI-driven frameworks.

Scientific reports
The assessment of Intercultural Competence (IC) is increasingly recognized as an essential component of students' professional competency development in higher education settings. This study looks at the objectives of creating ICC and offers an asses...

Research on learning achievement classification based on machine learning.

PloS one
Academic achievement is an important index to measure the quality of education and students' learning outcomes. Reasonable and accurate prediction of academic achievement can help improve teachers' educational methods. And it also provides correspond...

Assessment of Large Language Model Performance on Medical School Essay-Style Concept Appraisal Questions: Exploratory Study.

JMIR medical education
Bing Chat (subsequently renamed Microsoft Copilot), a ChatGPT 4.0-based large language model, demonstrated comparable performance to medical students in answering essay-style concept appraisals, while assessors struggled to differentiate artificial int...

Performance of DeepSeek-R1 and ChatGPT-4o on the Chinese National Medical Licensing Examination: A Comparative Study.

Journal of medical systems
Large Language Models (LLMs) have a significant impact on medical education due to their advanced natural language processing capabilities. ChatGPT-4o (Chat Generative Pre-trained Transformer), a mainstream Western LLM, demonstrates powerful multimod...

Performance of GPT-4 in oral and maxillofacial surgery board exams: challenges in specialized questions.

Oral and maxillofacial surgery
PURPOSE: The aim of this study was to evaluate the performance of GPT-4 in answering oral and maxillofacial surgery (OMFS) board exam questions, given its success in other medical specializations.

Performance of single-agent and multi-agent language models in Spanish language medical competency exams.

BMC medical education
BACKGROUND: Large language models (LLMs) like GPT-4o have shown promise in advancing medical decision-making and education. However, their performance in Spanish-language medical contexts remains underexplored. This study evaluates the effectiveness ...

Is AI the future of evaluation in medical education?? AI vs. human evaluation in objective structured clinical examination.

BMC medical education
BACKGROUND: Objective Structured Clinical Examinations (OSCEs) are widely used in medical education to assess students' clinical and professional skills. Recent advancements in artificial intelligence (AI) offer opportunities to complement human eval...

AI-generated questions for urological competency assessment: a prospective educational study.

BMC medical education
BACKGROUND: The integration of artificial intelligence (AI) in medical education assessment remains largely unexplored, particularly in specialty-specific evaluations during clinical rotations. Traditional question development methods are time-intens...