Multimodal large language models as assistance for evaluation of thyroid-associated ophthalmopathy.

Journal: Computers in biology and medicine

Published Date: May 1, 2025

Abstract

This study evaluated the potential of multimodal AI chatbots, specifically ChatGPT-4o, in assessing thyroid-associated ophthalmopathy (TAO) through the Clinical Activity Score (CAS). Using publicly available case reports and datasets, ChatGPT-4o was tasked with generating a web-based CAS calculator and estimating CAS from external ocular photographs. Its predictions were compared with CAS evaluations by ophthalmologists and convolutional neural network (CNN) models, including ResNet50. Areas under the receiver operating characteristic curve (ROC-AUCs) were calculated for the assessment of active TAO (CAS ≥3). ChatGPT-4o demonstrated high accuracy, with mean absolute errors of 0.39 and 0.45 compared to reference ophthalmologist scores across two datasets, outperforming both Gemini Advanced and ResNet50 in identifying active TAO. In the preoperative and pre-treatment datasets, ChatGPT-4o achieved ROC-AUCs of 0.974 and 0.990, respectively, significantly exceeding the performance of ResNet50 (0.770 and 0.623). ChatGPT-4o and Customized GPTs achieved identical results, suggesting robust performance without the need for further customization. The AI chatbot effectively processed both text- and image-based inputs, providing detailed explanations for its CAS estimates and creating a user-friendly calculator for rapid and accessible TAO evaluation. ChatGPT-4o can thus offer a reliable tool for TAO assessment, outperforming traditional CNN-based models. Its ability to generate a CAS calculator without prior training or coding expertise highlights its practical utility for clinical ophthalmology. The study's limitations included a small sample size, lack of real-world validation, reliance on photos without patient metadata, and challenges in repeatability. Future studies should aim to validate its effectiveness in real-world clinical settings.
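
To make the quantities in the abstract concrete, the following is a minimal Python sketch of a CAS calculator together with the two reported metrics, mean absolute error and ROC-AUC for active TAO (CAS ≥3). It is not the calculator ChatGPT-4o generated in the study, and it is not the authors' evaluation code. It assumes the widely used 7-item CAS with one point per finding; the field names, the helper functions, and the toy scores are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, fields


@dataclass
class CasItems:
    """One flag per item of the 7-item CAS (field names are illustrative)."""
    spontaneous_retrobulbar_pain: bool = False
    pain_on_gaze: bool = False
    eyelid_redness: bool = False
    conjunctival_redness: bool = False
    eyelid_swelling: bool = False
    caruncle_or_plica_inflammation: bool = False
    chemosis: bool = False


def cas_score(items: CasItems) -> int:
    """Sum one point for every finding that is present (range 0 to 7)."""
    return sum(bool(getattr(items, f.name)) for f in fields(items))


def is_active(score: int, threshold: int = 3) -> bool:
    """Active TAO is defined in the abstract as CAS >= 3."""
    return score >= threshold


def mean_absolute_error(predicted, reference) -> float:
    """Average absolute difference between predicted and reference scores."""
    if len(predicted) != len(reference) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - r) for p, r in zip(predicted, reference)) / len(predicted)


def roc_auc(scores, labels) -> float:
    """ROC-AUC via pairwise comparison: the fraction of (positive, negative)
    pairs in which the positive case scores higher, counting ties as half."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data, invented for illustration only (not taken from the study).
reference = [1, 4, 2, 5, 0, 3]   # hypothetical ophthalmologist CAS
predicted = [1, 3, 2, 5, 1, 2]   # hypothetical model-estimated CAS
labels = [is_active(r) for r in reference]

mae = mean_absolute_error(predicted, reference)
auc = roc_auc(predicted, labels)
print(f"MAE = {mae:.2f}, ROC-AUC = {auc:.3f}")
```

The pairwise AUC is quadratic in the number of cases, which is acceptable for small datasets such as those described here; a rank-based implementation would be preferable for larger samples.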

Authors

Bo Ram Kim

Interdisciplinary Program in Senior Human Ecology, Changwon National University, Changwon, 51140 Korea.
Joon Yul Choi

Department of Electrical and Computer Engineering, Seoul National University, Seoul, South Korea.
Tae Keun Yoo

Keywords

Databases, Factual; Graves Ophthalmopathy; Humans; Large Language Models; Neural Networks, Computer; ROC Curve

External Resources

PubMed ID: 40315721
