protPheMut: An Interpretable Machine Learning Tool for Classification of Cancer and Neurodevelopmental Disorders in Human Missense Mutations.
Journal:
Journal of chemical information and modeling
Published Date:
Jul 28, 2025
Abstract
Recent advances in human genomics have revealed that missense mutations in a single protein can lead to distinctly different phenotypes. In particular, some mutations in oncoproteins such as MEK1, MEK2, PI3Kα, PTEN, SHP2, and RAS are linked to various cancers and neurodevelopmental disorders (NDDs). While numerous tools exist for predicting the pathogenicity of missense mutations, linking these mutations to specific phenotypes remains a major challenge, particularly in the context of personalized medicine. To fill this gap, we developed protPheMut (Protein Phenotypic Mutations Analyzer, http://netprotlab.com/protPheMut), an interpretable machine learning tool that integrates diverse biophysical and network dynamics-based signatures to predict whether mutations in the same protein promote cancer or NDDs, with SHAP explanations used to enhance model transparency. Overall, protPheMut achieved an AUROC of 0.9118 in cross-validation and 0.8925 on an independent test set for discriminating cancer-related from NDD-related mutations. We further illustrate its utility in phenotype (cancer/NDD) prediction through mutation analyses of two protein cases, PI3Kα and PTEN. Compared to seven other predictive tools, protPheMut demonstrated exceptional accuracy in forecasting phenotypic effects, achieving an AUROC of 0.8501 for PI3Kα mutations related to cancer and Cowden syndrome. For multi-phenotype prediction of PTEN mutations related to cancer, PHTS, and HCPS, protPheMut achieved a micro-averaged AUROC of 0.9349. Through SHAP model explanations, protPheMut highlights the strength of network and dynamics features in uncovering the effects of pathogenic mutations and, in turn, in classifying different disease phenotypes.
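The abstract reports a micro-averaged AUROC for the three-phenotype PTEN task without spelling out the computation. As a minimal sketch of what micro averaging usually means, and not the authors' implementation, the Python below pools every (sample, class) pair into one binary problem and computes a single AUROC over the pooled pairs. The function names, the class ordering, and the toy scores are all hypothetical and are not taken from the paper.

```python
from typing import List, Sequence


def binary_auroc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUROC as the probability that a random positive outscores a random
    negative, with ties counted as one half (pairwise definition)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def micro_auroc(
    y_true: Sequence[int],
    y_score: Sequence[Sequence[float]],
    n_classes: int,
) -> float:
    """Micro-averaged AUROC: flatten the one-vs-rest indicator matrix and the
    score matrix, then treat the result as a single binary problem."""
    flat_labels: List[int] = []
    flat_scores: List[float] = []
    for true_class, row in zip(y_true, y_score):
        for k in range(n_classes):
            flat_labels.append(1 if true_class == k else 0)
            flat_scores.append(row[k])
    return binary_auroc(flat_labels, flat_scores)


# Hypothetical toy data: class indices 0, 1, 2 stand in for the three PTEN
# phenotypes (cancer, PHTS, HCPS). The scores are invented for illustration.
y_true = [0, 0, 1, 1, 2, 2]
y_score = [
    [0.7, 0.2, 0.1],
    [0.5, 0.3, 0.2],
    [0.2, 0.6, 0.2],
    [0.4, 0.4, 0.2],
    [0.1, 0.2, 0.7],
    [0.3, 0.4, 0.3],
]

if __name__ == "__main__":
    print(f"micro-averaged AUROC on toy data: {micro_auroc(y_true, y_score, 3):.4f}")
```

Because micro averaging pools all pairs, classes with more mutations contribute more to the final score; a macro average, which averages per-class AUROCs equally, could give a different number on imbalanced data.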