A multimodal deep learning framework for enzyme turnover prediction with missing modality.
Journal:
Computers in biology and medicine
Published Date:
May 22, 2025
Abstract
Accurate prediction of the turnover number (k), which quantifies the maximum rate of substrate conversion at an enzyme's active site, is essential for assessing catalytic efficiency and understanding biochemical reaction mechanisms. Traditional wet-lab measurements of k are time-consuming and resource-intensive, making deep learning (DL) methods an appealing alternative. However, existing DL models often overlook the impact of reaction products on k due to feedback inhibition, resulting in suboptimal performance. The multimodal nature of this k prediction task, involving enzymes, substrates, and products as inputs, presents additional challenges when certain modalities are unavailable during inference due to incomplete data or experimental constraints, leading to the inapplicability of existing DL models. To address these limitations, we introduce MMKcat, a novel framework employing a prior-knowledge-guided missing modality training mechanism, which treats substrates and enzyme sequences as essential inputs while considering other modalities as maskable terms. Moreover, an innovative auxiliary regularizer is incorporated to encourage the learning of informative features from various modal combinations, enabling robust predictions even with incomplete multimodal inputs. We demonstrate the superior performance of MMKcat compared to state-of-the-art methods, including DLKcat, TurNup, UniKP, EITLEM-Kinetic, DLTKcat and GELKcat, using BRENDA and SABIO-RK. Our results show significant improvements under both complete and missing modality scenarios in RMSE, R, and SRCC metrics, with average improvements of 6.41%, 22.18%, and 8.15%, respectively. Codes are available at https://github.com/ProEcho1/MMKcat.