PulmoFusion: Advancing Pulmonary Health with Efficient Multi-Modal Fusion
Journal: arXiv
Published Date: Jan 29, 2025
Abstract
Traditional remote spirometry lacks the precision required for effective
pulmonary monitoring. We present a novel, non-invasive approach using
multimodal predictive models that integrate RGB or thermal video data with
patient metadata. Our method leverages energy-efficient Spiking Neural Networks
(SNNs) for the regression of Peak Expiratory Flow (PEF) and classification of
Forced Expiratory Volume in one second (FEV1) and Forced Vital Capacity (FVC),
with lightweight CNNs used to overcome SNN limitations in regression tasks. A
multi-head attention layer improves multimodal data integration, and we employ
K-fold cross-validation and ensemble learning to boost robustness. Using thermal
data, our SNN models achieve 92% accuracy on a breathing-cycle basis and 99.5%
patient-wise. PEF regression models attain Relative RMSEs of 0.11 (thermal) and
0.26 (RGB), with an MAE of 4.52% for FEV1/FVC predictions, establishing
state-of-the-art performance. Code and dataset are available at
https://github.com/ahmed-sharshar/RespiroDynamics.git
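
The abstract reports a Relative RMSE for PEF regression and separates breathing-cycle accuracy from patient-wise accuracy, but does not define either. The sketch below shows one plausible reading, not the authors' implementation. It assumes that Relative RMSE means RMSE divided by the mean of the true values, and that a patient-level prediction is obtained by majority vote over that patient's per-cycle predictions. The function names `relative_rmse` and `patient_wise_accuracy` are hypothetical and do not come from the paper or its repository.

```python
import math
from collections import Counter, defaultdict


def relative_rmse(y_true, y_pred):
    """RMSE normalised by the mean of the true values.

    Assumption: the abstract does not state the normalisation; dividing by
    the mean of y_true is one common convention.
    """
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)
    return math.sqrt(mse) / (sum(y_true) / len(y_true))


def patient_wise_accuracy(patient_ids, cycle_preds, patient_labels):
    """Accuracy after aggregating per-cycle predictions for each patient.

    Assumption: aggregation is a majority vote over breathing cycles.
    Ties resolve to the class that appeared first for that patient.
    """
    votes = defaultdict(Counter)
    for pid, pred in zip(patient_ids, cycle_preds):
        votes[pid][pred] += 1
    correct = sum(
        1
        for pid, counts in votes.items()
        if counts.most_common(1)[0][0] == patient_labels[pid]
    )
    return correct / len(votes)


if __name__ == "__main__":
    # Toy PEF values in L/min, invented purely for illustration.
    print(relative_rmse([100.0, 200.0, 300.0], [110.0, 190.0, 310.0]))
    # Patient "a" has one misclassified cycle; the vote still recovers the label.
    print(patient_wise_accuracy(
        ["a", "a", "a", "b", "b"], [1, 1, 0, 0, 0], {"a": 1, "b": 0}
    ))
```

Under this reading, patient-wise accuracy can exceed cycle-wise accuracy because isolated per-cycle errors are outvoted, which is consistent with the gap between the two figures reported above.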