MedKA: A knowledge graph-augmented approach to improve factuality in medical Large Language Models.

Journal: Journal of biomedical informatics
Published Date:

Abstract

Large language models (LLMs) have demonstrated remarkable potential in medical applications. However, they still face critical challenges such as hallucinations, knowledge inconsistency, and insufficient integration of domain-specific medical expertise. To address these limitations, we introduce MedKA, a novel knowledge graph-augmented approach for fine-tuning and evaluating medical LLMs. Our approach systematically transforms structured knowledge from a medical knowledge graph into a high-quality QA corpus, cMKGQA, by clustering multiple fields around clinically meaningful scenarios (e.g., diagnosis, treatment planning). This grouping strategy enables comprehensive, use-case-specific data construction and supports one-stage training of the LLM, ensuring better alignment with structured medical knowledge. The transformation process integrates domain-specific knowledge comprehensively while maintaining factual consistency. To evaluate the factuality of LLM-generated responses, we further propose the Knowledge Graph-based Auxiliary Evaluation Metrics (KG-AEMs), a novel benchmarking framework that compares LLM outputs with fine-grained, attribute-level ground truth from the knowledge graph. Experimental results demonstrate that MedKA achieves state-of-the-art performance, significantly outperforming existing models, including LLaMA-3.1-8B-Chinese-Chat, HuatuoGPT2-7B, and Apollo2-7B. On the cMKGQA dataset, MedKA achieves a BLEU-1 score of 44.63 and a BLEU-4 score of 17.62, with particularly strong performance in areas such as medication recommendations and diagnostic tests as measured by KG-AEMs. Our approach highlights the potential of integrating knowledge graphs into LLM fine-tuning to improve the accuracy and reliability of medical AI systems. It advances factual accuracy in medical dialogue systems and provides a comprehensive framework for evaluating the integration of medical knowledge into LLMs.
This work is publicly available on GitHub: https://github.com/Yai017/MedKA.
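
The two ideas described in the abstract, grouping knowledge graph fields into scenario-specific QA pairs and scoring a response against attribute-level ground truth, can be illustrated with a minimal sketch. This is not the authors' implementation: the field names, the scenario grouping, the question template, the functions `build_qa_pairs` and `attribute_recall`, and the toy record are all hypothetical, and the simple substring-based recall only stands in for whatever matching KG-AEMs actually use.

```python
from typing import Dict, List

# Hypothetical scenario-to-field grouping; the actual cMKGQA fields
# are not specified in the abstract.
SCENARIO_FIELDS: Dict[str, List[str]] = {
    "diagnosis": ["symptoms", "diagnostic_tests"],
    "treatment planning": ["medications", "procedures"],
}


def build_qa_pairs(disease: str, record: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Build one QA pair per scenario by grouping the related KG fields."""
    pairs = []
    for scenario, fields in SCENARIO_FIELDS.items():
        # Keep only fields that are present and non-empty in this record.
        parts = [
            f"{field.replace('_', ' ')}: {', '.join(record[field])}"
            for field in fields
            if record.get(field)
        ]
        if parts:
            pairs.append({
                "scenario": scenario,
                "question": f"What should I know about {scenario} for {disease}?",
                "answer": "; ".join(parts),
            })
    return pairs


def attribute_recall(response: str, gold_values: List[str]) -> float:
    """Fraction of KG attribute values found (case-insensitively) in the response."""
    if not gold_values:
        return 0.0
    text = response.lower()
    hits = sum(1 for value in gold_values if value.lower() in text)
    return hits / len(gold_values)


# Toy record for illustration only; it is not taken from cMKGQA.
record = {
    "symptoms": ["cough", "fever"],
    "diagnostic_tests": ["chest X-ray"],
    "medications": ["amoxicillin"],
    "procedures": [],
}
qa_pairs = build_qa_pairs("pneumonia", record)
score = attribute_recall(
    "Consider amoxicillin and order a chest x-ray.",
    record["medications"] + record["diagnostic_tests"],
)
```

Computing the score per attribute type (for example, medications separately from diagnostic tests) would give the kind of fine-grained breakdown the abstract attributes to KG-AEMs, though the exact metric definitions are given only in the full paper.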

Authors

  • Yiyan Deng
    Department of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China.
  • Shen Zhao
    Department of Electrical and Computer Engineering, The Ohio State University.
  • Yongming Miao
    Department of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing, Jiangsu, China; Jiangsu Key Laboratory of Intelligent Medical Image Computing, Nanjing, Jiangsu, China.
  • Junjie Zhu
    Hunan University; zhujunjie@hnu.edu.cn.
  • Jin Li
    Mental Health Center, West China Hospital, Sichuan University, Chengdu, China.