anyECG-chat: A Generalist ECG-MLLM for Flexible ECG Input and Multi-Task Understanding
Journal:
arXiv
Published Date:
Jun 1, 2025
Abstract
The advent of multimodal large language models (MLLMs) has sparked interest
in their application to electrocardiogram (ECG) analysis. However, existing
ECG MLLMs concentrate on report generation and are often limited to a
single 12-lead, short-duration (10 s) ECG input, thereby underutilizing the
potential of MLLMs. We therefore aim to develop an MLLM for ECG analysis that
supports a broader range of tasks and more flexible ECG inputs. Existing
ECG-QA datasets, however, often lack variety. To address this gap, we first
constructed the anyECG dataset, which encompasses a wide variety of tasks,
including report generation, abnormal waveform localization, and open-ended
question answering. In addition to standard hospital ECGs, we introduced
long-duration reduced-lead ECGs for home environments and multi-ECG
comparison scenarios commonly encountered in clinical practice. Furthermore, we
propose the anyECG-chat model, which supports dynamic-length ECG inputs and
multiple ECG inputs. We trained the model using a three-stage curriculum
training recipe with the anyECG dataset. A comprehensive evaluation shows
that anyECG-chat supports various practical application scenarios, including
not only common report generation but also abnormal waveform localization for
long-duration reduced-lead ECGs in home environments and comprehensive
comparative analysis of multiple ECGs.