Explainable Artificial Intelligence (XAI) in the Era of Large Language Models: Applying an XAI Framework in Pediatric Ophthalmology Diagnosis using the Gemini Model.

Journal: AMIA Joint Summits on Translational Science Proceedings
Published Date:

Abstract

Amblyopia is a neurodevelopmental disorder affecting children's visual acuity, requiring early diagnosis for effective treatment. Traditional diagnostic methods rely on subjective evaluation of recordings from high-fidelity eye-tracking instruments by specialized pediatric ophthalmologists, who are often unavailable in rural, low-resource clinics. There is therefore an urgent need for a scalable, low-cost, high-accuracy approach to automatically analyze eye-tracking recordings. Large Language Models (LLMs) show promise for accurate detection of amblyopia; our prior work has shown that the Google Gemini model, guided by expert ophthalmologists, can distinguish amblyopic subjects from controls using eye-tracking recordings. However, there is a clear need to address transparency and trust in medical applications of LLMs. To bolster the reliability and interpretability of LLM analysis of eye-tracking recordings, we developed a Feature-Guided Interpretive Prompting (FGIP) framework focused on critical clinical features. Using the Google Gemini model, we classify high-fidelity eye-tracking data to detect amblyopia in children and apply the Quantus framework to evaluate the classification results across key metrics (faithfulness, robustness, localization, and complexity). These metrics provide a quantitative basis for understanding the model's decision-making process. This work presents the first implementation of an Explainable Artificial Intelligence (XAI) framework to systematically characterize the results generated by the Gemini model on high-fidelity eye-tracking data for detecting amblyopia in children. Results demonstrate that the model accurately classified control and amblyopic subjects, including those with nystagmus, while maintaining transparency and clinical alignment. The results of this study support the development of a scalable and interpretable clinical decision support (CDS) tool using LLMs that has the potential to enhance the trustworthiness of AI applications.
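
For orientation, the four Quantus metric families named above can be illustrated with a toy example. The numpy sketch below is not the Quantus library's API and not the paper's pipeline: the linear scorer, the attribution rule, and the clinical_mask of expert-flagged features are hypothetical stand-ins for the Gemini classifier and its per-feature attributions.

    import numpy as np

    rng = np.random.default_rng(0)

    # Hypothetical stand-ins: one recording's feature vector (e.g., fixation
    # stability, saccade amplitude), a toy linear scorer in place of the LLM
    # classifier, and a gradient*input-style attribution per feature.
    x = rng.normal(size=8)
    w = rng.normal(size=8)

    def predict(v):
        # Toy linear score standing in for the classifier's output.
        return float(w @ v)

    a = w * x                                            # per-feature attributions
    clinical_mask = np.array([1, 1, 0, 0, 1, 0, 0, 0],   # expert-flagged features
                             dtype=bool)

    # Faithfulness: the output drop from zeroing a feature should track its attribution.
    drops = np.array([predict(x) - predict(np.where(np.arange(8) == i, 0.0, x))
                      for i in range(8)])
    faithfulness = np.corrcoef(a, drops)[0, 1]

    # Robustness: attributions should barely move under a small input perturbation.
    x_pert = x + rng.normal(scale=0.01, size=8)
    robustness = float(np.linalg.norm(w * x_pert - a))

    # Localization: share of attribution mass on clinically expected features.
    localization = float(np.abs(a[clinical_mask]).sum() / np.abs(a).sum())

    # Complexity: entropy of normalized attributions (lower = sparser explanation).
    p = np.abs(a) / np.abs(a).sum()
    complexity = float(-(p * np.log(p + 1e-12)).sum())

    print(f"faithfulness={faithfulness:.3f}  robustness={robustness:.4f}  "
          f"localization={localization:.3f}  complexity={complexity:.3f}")

In the study itself, Quantus computes these metric families over the model's actual attributions; the sketch only shows the quantity each family targets.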

Authors

  • Dipak P Upadhyaya
    Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA.
  • Katrina Prantzalos
    Department of Population and Quantitative Health Sciences, Case Western Reserve University, Cleveland, OH, USA.
  • Pedram Golnari
    Department of Population and Quantitative Health Sciences, Case Western Reserve University School of Medicine, Cleveland, OH, USA.
  • Aasef G Shaikh
    Department of Neurology, Case Western Reserve University, Cleveland, OH, USA; National VA Parkinson Consortium Center and Neurology Service, Louis Stokes Cleveland VA Medical Center, Cleveland, OH, USA; Neurological Institute, University Hospitals, Cleveland, OH, USA.
  • Subhashini Sivagnanam
    San Diego Supercomputer Center, University of California, San Diego, CA, USA.
  • Amitava Majumdar
    University of California, San Diego, San Diego, CA, USA.
  • Fatema F Ghasia
    Visual Neuroscience Laboratory, Cole Eye Institute, Cleveland Clinic, Cleveland, OH, USA.
  • Satya S Sahoo
    Division of Medical Informatics, School of Medicine, Case Western Reserve University, Cleveland, OH, USA.
