AOR: Anatomical Ontology-Guided Reasoning for Medical Large Multimodal Model in Chest X-Ray Interpretation
Journal: arXiv
Published Date: May 5, 2025
Abstract
Chest X-rays (CXRs) are the most frequently performed imaging examinations in
clinical settings. Recent advancements in Large Multimodal Models (LMMs) have
enabled automated CXR interpretation, enhancing diagnostic accuracy and
efficiency. However, despite their strong visual understanding, current Medical
LMMs (MLMMs) still face two major challenges: (1) insufficient region-level
understanding and interaction, and (2) limited accuracy and interpretability
due to single-step reasoning. In this paper, we empower MLMMs with
anatomy-centric reasoning capabilities to enhance their interactivity and
explainability. Specifically, we first propose an Anatomical Ontology-Guided
Reasoning (AOR) framework, which centers on cross-modal region-level
information to facilitate multi-step reasoning. Next, under the guidance of
expert physicians, we develop AOR-Instruction, a large instruction dataset for
MLMM training. Our experiments demonstrate AOR's superior performance in both
visual question answering (VQA) and report generation tasks.