DAMPER: A Dual-Stage Medical Report Generation Framework with Coarse-Grained MeSH Alignment and Fine-Grained Hypergraph Matching
Journal:
arXiv
Published Date:
Dec 19, 2024
Abstract
Medical report generation is crucial for clinical diagnosis and patient
management, summarizing diagnoses and recommendations based on medical imaging.
However, existing work often overlooks the clinical pipeline involved in report
writing, where physicians typically conduct an initial quick review followed by
a detailed examination. Moreover, current alignment methods may lead to
misaligned relationships. To address these issues, we propose DAMPER, a
dual-stage framework for medical report generation that mimics the clinical
pipeline of report writing in two stages. The first stage, MeSH-Guided
Coarse-Grained Alignment (MCG), aligns chest X-ray (CXR) image
features with medical subject headings (MeSH) features to generate a rough
keyphrase representation of the overall impression. The second stage,
Hypergraph-Enhanced Fine-Grained Alignment (HFG), constructs
hypergraphs for image patches and report annotations, modeling high-order
relationships within each modality and performing hypergraph matching to
capture semantic correlations between image regions and textual phrases.
Finally, the coarse-grained visual features, generated MeSH representations, and
visual hypergraph features are fed into a report decoder to produce the final
medical report. Extensive experiments on public datasets demonstrate the
effectiveness of DAMPER in generating comprehensive and accurate medical
reports, outperforming state-of-the-art methods across various evaluation
metrics.
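
The fine-grained stage described above (build a hypergraph per modality, model high-order relationships, then match across modalities) can be illustrated with a minimal NumPy sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the abstract does not say how hyperedges are formed or how matching is computed, so the sketch uses k-nearest-neighbour hyperedges, one parameter-free round of the standard normalized hypergraph propagation, and a softmax over cosine similarities as a stand-in for hypergraph matching. The function names `knn_incidence`, `hypergraph_propagate`, and `soft_match` are hypothetical, and the random features stand in for real patch and phrase embeddings.

```python
import numpy as np

def knn_incidence(features, k):
    """Incidence matrix H (nodes x hyperedges); hyperedge j groups node j
    with its k most cosine-similar neighbours (an assumed construction)."""
    unit = features / np.linalg.norm(features, axis=1, keepdims=True)
    sim = unit @ unit.T
    np.fill_diagonal(sim, -np.inf)          # exclude self when ranking neighbours
    n = features.shape[0]
    H = np.zeros((n, n))
    for j in range(n):
        neighbours = np.argsort(-sim[j])[:k]
        H[neighbours, j] = 1.0
        H[j, j] = 1.0                       # every node belongs to its own hyperedge
    return H

def hypergraph_propagate(features, H):
    """One round of Dv^-1/2 H De^-1 H^T Dv^-1/2 X (no learned weights)."""
    dv_inv_sqrt = 1.0 / np.sqrt(H.sum(axis=1))   # node degrees
    de_inv = 1.0 / H.sum(axis=0)                 # hyperedge degrees
    left = (dv_inv_sqrt[:, None] * H) * de_inv[None, :]
    right = H.T * dv_inv_sqrt[None, :]
    return left @ right @ features

def soft_match(visual, textual, temperature=0.1):
    """Row-stochastic soft correspondence between visual and textual nodes."""
    v = visual / np.linalg.norm(visual, axis=1, keepdims=True)
    t = textual / np.linalg.norm(textual, axis=1, keepdims=True)
    logits = (v @ t.T) / temperature
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    weights = np.exp(logits)
    return weights / weights.sum(axis=1, keepdims=True)

# Placeholder embeddings: 6 image patches and 4 report phrases, 8 dimensions each.
rng = np.random.default_rng(0)
patches = rng.normal(size=(6, 8))
phrases = rng.normal(size=(4, 8))

H_img = knn_incidence(patches, k=2)
H_txt = knn_incidence(phrases, k=2)
match = soft_match(hypergraph_propagate(patches, H_img),
                   hypergraph_propagate(phrases, H_txt))
```

Each row of `match` is a distribution over phrases for one image patch. A trained model would learn the embeddings and the matching objective instead of using fixed similarities.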