Deciphering the dark cancer phosphoproteome using machine-learned co-regulation of phosphosites.
Journal:
Nature Communications
PMID:
40113755
Abstract
Mass spectrometry-based phosphoproteomics offers a comprehensive view of protein phosphorylation, yet our limited knowledge about the regulation and function of most phosphosites hampers the extraction of meaningful biological insights. To address this challenge, we integrate machine learning with phosphoproteomic data from 1195 tumor specimens spanning 11 cancer types to construct CoPheeMap, a network that maps the co-regulation of 26,280 phosphosites. By incorporating network features from CoPheeMap into a second machine learning model, namely CoPheeKSA, we achieve superior performance in predicting kinase-substrate associations. CoPheeKSA uncovers 24,015 associations between 9399 phosphosites and 104 serine/threonine kinases, shedding light on many unannotated phosphosites and understudied kinases. We validate the accuracy of these predictions using experimentally determined kinase-substrate specificities. Through the application of CoPheeMap and CoPheeKSA to phosphosites with high computationally predicted functional significance and those associated with cancer, we demonstrate their effectiveness in systematically elucidating phosphosites of interest. These analyses unveil dysregulated signaling processes in human cancer and identify understudied kinases as potential therapeutic targets.
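The two-stage idea in the abstract (a phosphosite co-regulation network whose features feed a kinase-substrate predictor) can be illustrated with a toy sketch. This is not the authors' method: the abstract states that CoPheeMap is built with machine learning, whereas the sketch below substitutes a plain Pearson-correlation threshold to define co-regulation edges. All names here (`build_coregulation_network`, `neighbor_substrate_fraction`, the site labels, the abundance values, and the 0.8 threshold) are hypothetical and chosen only for illustration.

```python
from itertools import combinations
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation of two equal-length abundance profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a constant profile carries no co-variation signal
    return cov / sqrt(vx * vy)


def build_coregulation_network(profiles, threshold=0.8):
    """Connect two sites when their profiles correlate at or above threshold.

    Assumption: correlation thresholding stands in for the learned
    co-regulation model described in the abstract.
    """
    network = {site: set() for site in profiles}
    for a, b in combinations(profiles, 2):
        if pearson(profiles[a], profiles[b]) >= threshold:
            network[a].add(b)
            network[b].add(a)
    return network


def neighbor_substrate_fraction(network, site, known_substrates):
    """Toy network feature: fraction of a site's neighbors that are
    already-known substrates of a given kinase."""
    neighbors = network[site]
    if not neighbors:
        return 0.0
    return len(neighbors & known_substrates) / len(neighbors)


# Hypothetical abundance profiles across four tumor samples.
profiles = {
    "SITE_A": [1.0, 2.0, 3.0, 4.0],
    "SITE_B": [1.1, 2.1, 2.9, 4.2],
    "SITE_C": [4.0, 3.0, 2.0, 1.0],
    "SITE_D": [2.0, 4.0, 6.0, 8.5],
}
network = build_coregulation_network(profiles, threshold=0.8)

# Hypothetical set of known substrates for one kinase.
known = {"SITE_B"}
feature = neighbor_substrate_fraction(network, "SITE_A", known)
```

In a fuller pipeline, a feature like this would be computed per site and per kinase and passed, with other descriptors, to a second-stage classifier. The sketch stops at the feature because the abstract does not specify the classifier or its inputs.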