Adaptive Modality Balanced Online Knowledge Distillation for Brain-Eye-Computer based Dim Object Detection
Journal: arXiv
Published Date: Jul 2, 2024
Abstract
Advanced cognition can be extracted from the human brain using brain-computer
interfaces. Integrating these interfaces with computer vision techniques, which
possess efficient feature extraction capabilities, can achieve more robust and
accurate detection of dim targets in aerial images. However, existing target
detection methods primarily concentrate on homogeneous data, lacking efficient
and versatile processing capabilities for heterogeneous multimodal data. In
this paper, we first build a brain-eye-computer based object detection system
for aerial images under few-shot conditions. This system detects suspicious
targets using region proposal networks, evokes the event-related potential
(ERP) signal in electroencephalogram (EEG) through the eye-tracking-based slow
serial visual presentation (ESSVP) paradigm, and constructs the EEG-image data
pairs with eye movement data. Then, an adaptive modality balanced online
knowledge distillation (AMBOKD) method is proposed to recognize dim objects
with the EEG-image data. AMBOKD fuses EEG and image features using a multi-head
attention module, establishing a new modality with comprehensive features. To
improve the performance and robustness of the fusion modality, end-to-end
online knowledge distillation enables simultaneous training and mutual
learning between modalities. During the learning process, an
adaptive modality balancing module maintains multimodal equilibrium by
dynamically adjusting the importance weights and the training gradients
of the individual modalities. The effectiveness and superiority of our
method are demonstrated by comparing it with existing state-of-the-art methods.
Additionally, experiments conducted on public datasets and system validations
in real-world scenarios demonstrate the reliability and practicality of the
proposed system and the designed method.
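
To make the mutual-learning idea in the abstract concrete, the following is a minimal sketch of an online distillation loss among three branches (EEG, image, and fusion). It is not the AMBOKD formulation: the abstract gives no equations, so the temperature-scaled softmax, the KL-divergence objective, the equal averaging over peers, the function names, and the example logits are all assumptions chosen for illustration. The adaptive balancing of importance weights and gradients is deliberately left out.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def mutual_distillation_losses(logits_by_modality, temperature=2.0):
    """For each branch, average KL(peer || branch) over all other branches.

    Assumption: every branch acts as both teacher and student, and all
    peers are weighted equally. The paper's adaptive weighting is not
    reproduced here.
    """
    probs = {
        name: softmax(logits, temperature)
        for name, logits in logits_by_modality.items()
    }
    losses = {}
    for student, p_student in probs.items():
        peers = [p for name, p in probs.items() if name != student]
        losses[student] = sum(
            kl_divergence(p_peer, p_student) for p_peer in peers
        ) / len(peers)
    return losses


# Hypothetical two-class logits (target vs. non-target) for one sample.
example_logits = {
    "eeg": [0.2, 1.1],
    "image": [1.5, -0.3],
    "fusion": [0.9, 0.4],
}
example_losses = mutual_distillation_losses(example_logits)
```

In a training loop, each branch's distillation term would typically be added to its own classification loss, so that the branches are trained simultaneously rather than distilling from a fixed, pre-trained teacher.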