A Novel Evaluation Framework for Medical LLMs: Combining Fuzzy Logic and MCDM for Medical Relation and Clinical Concept Extraction.

Journal: Journal of Medical Systems
Published Date:

Abstract

Artificial intelligence (AI) has become a crucial element of modern technology, especially in healthcare, as shown by the continuous development of large language models (LLMs) that are used in various domains, including medicine. However, using these LLMs in the medical domain requires an evaluation platform to determine their suitability and guide future development efforts. To that end, this study develops a comprehensive Multi-Criteria Decision Making (MCDM) approach designed specifically to evaluate medical LLMs. The success of AI, particularly LLMs, in healthcare depends on efficacy, safety, and ethical compliance, so a robust evaluation framework is essential for their integration into medical contexts. This study proposes using the Fuzzy-Weighted Zero-InConsistency (FWZIC) method, extended to the p, q-quasirung orthopair fuzzy set (p, q-QROFS), for weighting the evaluation criteria. This extension enables the handling of uncertainties inherent in medical decision-making processes. By incorporating fuzzy logic principles, the approach accommodates the imprecise and multifaceted nature of real-world medical data and criteria. The MultiAttributive Ideal-Real Comparative Analysis (MAIRCA) method is employed to assess the medical LLMs used in the case study of this research. The results show that the "Medical Relation Extraction" criterion, with its sub-levels, received a higher importance weight (0.504) than "Clinical Concept Extraction" (0.495). Of the six LLMs evaluated, "GatorTron S 10B" ranked 1st and "GatorTron 90B" ranked 6th. The implications of this study extend beyond academic discourse, directly affecting healthcare practices and patient outcomes. The proposed framework can help healthcare professionals make more informed decisions regarding the adoption and use of LLMs in medical settings.
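
The abstract names MAIRCA as the ranking step but does not spell it out. The sketch below shows the crisp, textbook form of MAIRCA, not the authors' implementation: each alternative gets an equal prior preference, theoretical ratings are that preference times the criterion weight, real ratings scale them by a min-max normalized score, and the alternative with the smallest total gap ranks first. The function name `mairca_rank`, the model names, and all scores are hypothetical and are not results from the study. Only the two weights (0.504 and 0.495) come from the abstract, and using them directly as top-level criterion weights is an assumption.

```python
def mairca_rank(scores, weights, benefit):
    """Rank alternatives with crisp MAIRCA (smaller total gap is better).

    scores:  dict mapping alternative name -> list of criterion scores
    weights: list of criterion weights
    benefit: list of bools, True if higher is better for that criterion
    """
    names = list(scores)
    n = len(weights)
    preference = 1.0 / len(names)              # equal prior for every alternative
    theoretical = [preference * w for w in weights]

    # Per-criterion minimum and maximum, used for min-max normalization.
    columns = [[scores[a][j] for a in names] for j in range(n)]
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]

    gaps = {}
    for a in names:
        total = 0.0
        for j in range(n):
            span = highs[j] - lows[j]
            if span == 0:
                ratio = 1.0                    # all alternatives tie on this criterion
            elif benefit[j]:
                ratio = (scores[a][j] - lows[j]) / span
            else:
                ratio = (highs[j] - scores[a][j]) / span
            real = theoretical[j] * ratio      # real rating for this cell
            total += theoretical[j] - real     # gap between theoretical and real
        gaps[a] = total

    ranking = sorted(names, key=lambda a: gaps[a])
    return gaps, ranking


# Hypothetical scores for three placeholder models on two benefit criteria:
# [medical relation extraction, clinical concept extraction].
example_scores = {
    "model_a": [0.90, 0.80],
    "model_b": [0.85, 0.88],
    "model_c": [0.80, 0.75],
}
gaps, ranking = mairca_rank(example_scores, [0.504, 0.495], [True, True])
print(ranking)  # ['model_b', 'model_a', 'model_c']
```

In this toy input, `model_c` is worst on both criteria, so its total gap equals the sum of its theoretical ratings, while `model_b` wins because it is best on one criterion and mid-range on the other. The fuzzy p, q-QROFS weighting described in the abstract is not reproduced here.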

Authors

  • A H AlAmoodi
    Department of Computing, FSKIK, Universiti Pendidikan Sultan Idris, Tanjong Malim, Malaysia.
  • Omar Zughoul
    Information Systems and Computer Science Department, Ahmed bin Mohammed Military College, Al-Shahaniya, Qatar.
  • Dianese David
    Faculty of Computing and Meta-Technology (FKMT), Universiti Pendidikan Sultan Idris (UPSI), Perak, Malaysia.
  • Salem Garfan
    Faculty of Computing and Meta-Technology (FKMT), Universiti Pendidikan Sultan Idris (UPSI), Perak, Malaysia.
  • Dragan Pamučar
    Department of Logistics, University of Defence in Belgrade, Pavla Jurisica Sturma 33, 11000 Belgrade, Serbia.
  • O S Albahri
    Department of Computing, FSKIK, Universiti Pendidikan Sultan Idris, Tanjung Malim 35900, Malaysia.
  • A S Albahri
    Informatics Institute for Postgraduate Studies (IIPS), Iraqi Commission for Computers and Informatics (ICCI), Baghdad, Iraq.
  • Salman Yussof
    Institute of Informatics and Computing in Energy, Universiti Tenaga Nasional, Kajang, Malaysia.
  • Iman Mohamad Sharaf
    Department of Basic Sciences, Higher Technological Institute, Tenth of Ramadan City, Egypt.