PAL: Prompting Analytic Learning with Missing Modality for Multi-Modal Class-Incremental Learning
Journal: arXiv
Published Date: Jan 16, 2025
Abstract
Multi-modal class-incremental learning (MMCIL) seeks to leverage multi-modal
data, such as audio-visual and image-text pairs, thereby enabling models to
learn continuously across a sequence of tasks while mitigating forgetting.
While existing studies primarily focus on the integration and utilization of
multi-modal information for MMCIL, a critical challenge remains: the issue of
missing modalities during incremental learning phases. Leaving this issue
unaddressed can worsen forgetting and significantly impair model performance. To
bridge this gap, we propose PAL, a novel exemplar-free framework tailored to
MMCIL under missing-modality scenarios. Concretely, we devise modality-specific
prompts to compensate for missing information, helping the model
maintain a holistic representation of the data. On this foundation, we
reformulate the MMCIL problem as a Recursive Least-Squares task, which yields
an analytical linear solution. Building on these two components, PAL not only
alleviates the inherent under-fitting limitation of analytic learning but also preserves the
holistic representation of missing-modality data, achieving superior
performance with less forgetting across various multi-modal incremental
scenarios. Extensive experiments demonstrate that PAL significantly outperforms
competitive methods across various datasets, including UPMC-Food101 and
N24News, demonstrating its robustness to missing modalities and its
ability to resist forgetting while maintaining high incremental accuracy.
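
To make the Recursive Least-Squares idea concrete, the following is a minimal sketch of a generic, exemplar-free RLS classifier update. It is not PAL's implementation: the modality-specific prompts and the multi-modal backbone are omitted, and the inputs are assumed to be fixed feature vectors from some frozen encoder. The names `init_rls`, `rls_step`, `predict`, `GAMMA`, and the toy data are all hypothetical, and the total number of classes is fixed up front as a simplification.

```python
def init_rls(dim, num_classes, gamma=1.0):
    """Start from the ridge prior: P = (gamma * I)^-1 and W = 0."""
    P = [[(1.0 / gamma if i == j else 0.0) for j in range(dim)] for i in range(dim)]
    W = [[0.0] * num_classes for _ in range(dim)]
    return P, W


def rls_step(P, W, x, label):
    """Fold one (feature, label) pair into the solution without stored exemplars."""
    dim, num_classes = len(x), len(W[0])
    Px = [sum(P[i][j] * x[j] for j in range(dim)) for i in range(dim)]
    denom = 1.0 + sum(x[i] * Px[i] for i in range(dim))
    # Sherman-Morrison rank-1 update of the inverse regularized autocorrelation.
    P_new = [[P[i][j] - Px[i] * Px[j] / denom for j in range(dim)] for i in range(dim)]
    gain = [v / denom for v in Px]  # equals P_new @ x
    target = [1.0 if c == label else 0.0 for c in range(num_classes)]
    pred = [sum(x[i] * W[i][c] for i in range(dim)) for c in range(num_classes)]
    # Correct the weights by the prediction error on the new sample only.
    W_new = [[W[i][c] + gain[i] * (target[c] - pred[c]) for c in range(num_classes)]
             for i in range(dim)]
    return P_new, W_new


def predict(W, x):
    """Return the index of the class with the highest linear score."""
    scores = [sum(x[i] * W[i][c] for i in range(len(x))) for c in range(len(W[0]))]
    return max(range(len(scores)), key=scores.__getitem__)


GAMMA = 1.0  # ridge regularization strength (assumed value)

# Toy features: phase 1 introduces classes 0 and 1, phase 2 introduces class 2.
phase1 = [([1.0, 0.0, 0.5], 0), ([0.9, 0.1, 0.4], 0),
          ([0.0, 1.0, 0.5], 1), ([0.1, 0.9, 0.6], 1)]
phase2 = [([0.0, 0.1, 1.5], 2), ([0.1, 0.0, 1.4], 2)]

P, W = init_rls(dim=3, num_classes=3, gamma=GAMMA)
for phase in (phase1, phase2):
    for x, label in phase:  # earlier phases are never revisited
        P, W = rls_step(P, W, x, label)
```

After both phases, `W` matches the ridge-regression solution that would be obtained by training on all the data jointly, which is the property that lets analytic approaches of this kind avoid forgetting in the linear classifier.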