Assessments of lung nodules by an artificial intelligence chatbot using longitudinal CT images.
Journal: Cell Reports Medicine
PMID: 40043704
Abstract
Large language models have shown efficacy across multiple medical tasks. However, their value in the assessment of longitudinal follow-up computed tomography (CT) images of patients with lung nodules is unclear. In this study, we evaluate the ability of the latest generative pre-trained transformer (GPT)-4o model to assess changes in malignancy probability, size, and features of lung nodules on longitudinal CT scans from 647 patients (547 from two local centers and 100 from a public dataset). GPT-4o achieves an average accuracy of 0.88 in predicting lung nodule malignancy compared to pathological results and an average intraclass correlation coefficient of 0.91 in measuring nodule size compared with manual measurements by radiologists. Six radiologists' evaluations demonstrate GPT-4o's ability to capture changes in nodule features with a median Likert score of 4.17 (out of 5.00). In summary, GPT-4o could capture dynamic changes in lung nodules across longitudinal follow-up CT images, thus providing high-quality radiological evidence to assist in clinical management.
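The two agreement metrics reported above (accuracy against pathology and the intraclass correlation coefficient against radiologists' measurements) can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is an assumption here. The function names `accuracy` and `icc_2_1` and all input values are hypothetical illustrations, not the study's code or data.

```python
from statistics import mean


def accuracy(predicted, truth):
    """Fraction of cases where the predicted label equals the reference label."""
    if len(predicted) != len(truth) or not truth:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measurement.

    `ratings` is a list of rows, one row per nodule, one column per rater
    (for example, column 0 = model, column 1 = radiologist).
    Assumed form; the source does not specify which ICC variant was used.
    """
    n = len(ratings)      # number of subjects (nodules)
    k = len(ratings[0])   # number of raters
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need a rectangular table with >= 2 rows and >= 2 columns")

    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)              # between-subject mean square
    msc = ss_cols / (k - 1)              # between-rater mean square
    mse = ss_err / ((n - 1) * (k - 1))   # residual mean square

    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


if __name__ == "__main__":
    # Hypothetical toy inputs, for illustration only.
    print(accuracy([1, 0, 1, 1], [1, 0, 0, 1]))
    print(icc_2_1([[5.0, 5.5], [8.0, 8.0], [12.0, 11.0], [6.5, 7.0]]))
```

Unlike a plain correlation, the absolute-agreement form penalizes a systematic offset between raters, which matters when the quantity of interest is the measured nodule size itself.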