A multimodal deep learning architecture for predicting interstitial glucose for effective type 2 diabetes management.
Journal:
Scientific Reports
Published Date:
Jul 29, 2025
Abstract
Accurate prediction of blood glucose is critical for the effective management of diabetes. Modern continuous glucose monitoring (CGM) technology enables real-time acquisition of interstitial glucose concentrations, which can be calibrated against blood glucose measurements. However, a key challenge in the effective management of type 2 diabetes lies in forecasting critical events driven by glucose variability. While recent advances in deep learning enable modeling of temporal patterns in glucose fluctuations, most existing methods rely on unimodal inputs and fail to account for individual physiological differences that influence interstitial glucose dynamics. These limitations highlight the need for multimodal approaches that integrate additional personalized physiological information. A primary reason multimodal approaches have not been widely studied in this field is the limited availability of subjects' health records. In this paper, we propose a multimodal approach trained on sequences of CGM values and enriched with physiological context derived from the health records of 40 individuals with type 2 diabetes. The CGM time series were processed using a stacked Convolutional Neural Network (CNN) and a Bidirectional Long Short-Term Memory (BiLSTM) network followed by an attention mechanism. The CNN captured local sequential features, while the BiLSTM learned long-term temporal dependencies. Physiological heterogeneity was incorporated through a separate pipeline of neural networks that processed baseline health records and was later fused with the CGM modeling stream. To validate our model, we used 30-min sequences of CGM values, sampled with a moving window of 5 min, to predict CGM values at prediction horizons of (a) 15 min, (b) 30 min, and (c) 60 min.
The multimodal architecture achieved a Mean Absolute Point Error (MAPE) of 14-24 mg/dL, 19-22 mg/dL, and 25-26 mg/dL with the Menarini sensor and 6-11 mg/dL, 9-14 mg/dL, and 12-18 mg/dL with the Abbott sensor for the 15, 30, and 60 min prediction horizons, respectively. The results suggest that the proposed multimodal model achieves higher prediction accuracy than unimodal approaches, with up to 96.7% prediction accuracy, supporting its potential as a generalizable solution for interstitial glucose prediction and personalized management in the type 2 diabetes population.
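
The windowing scheme described in the abstract (30 min of CGM history used to predict a value 15, 30, or 60 min ahead) can be illustrated with a minimal sketch. It is not the authors' code. It assumes one CGM reading every 5 min, which is one reading of the "moving window of 5 min", and the function name `make_windows` and the constant `SAMPLE_MINUTES` are hypothetical.

```python
from typing import List, Tuple

# Assumption: one CGM reading every 5 minutes (not stated explicitly in the abstract).
SAMPLE_MINUTES = 5


def make_windows(
    readings: List[float],
    history_minutes: int = 30,
    horizon_minutes: int = 15,
) -> List[Tuple[List[float], float]]:
    """Pair each history window with the reading horizon_minutes after its last sample.

    Hypothetical helper for illustration only.
    """
    n_hist = history_minutes // SAMPLE_MINUTES   # samples per input window (6 for 30 min)
    n_ahead = horizon_minutes // SAMPLE_MINUTES  # steps between window end and target
    pairs = []
    # Slide the window forward one sample (5 min) at a time.
    for start in range(len(readings) - n_hist - n_ahead + 1):
        window = readings[start:start + n_hist]
        target = readings[start + n_hist + n_ahead - 1]
        pairs.append((window, target))
    return pairs


# Synthetic readings in mg/dL, one hour at 5-min spacing (illustrative values only).
example = [100.0 + 5.0 * i for i in range(12)]
pairs_15 = make_windows(example, history_minutes=30, horizon_minutes=15)
```

Each pair holds six consecutive readings and the single reading that falls the chosen horizon after the last of them. A series too short for the requested horizon yields an empty list, so the one-hour example above produces no pairs for the 60 min horizon.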