A Network-Guided Penalized Regression with Application to Proteomics Data
Journal: arXiv
Published Date: May 29, 2025
Abstract
Network theory has proven invaluable in unraveling complex protein
interactions. Previous studies have employed statistical methods rooted in
network theory, including the Gaussian graphical model, to infer networks among
proteins, identifying hub proteins based on key structural properties of
networks such as degree centrality. However, little research has examined
the prognostic role of hub proteins for outcomes while adjusting for
clinical covariates in the context of high-dimensional data. To address this
gap, we propose a network-guided penalized regression method. First, we
construct a network using the Gaussian graphical model to identify hub
proteins. Next, we retain these hub proteins in the model along with clinically
relevant factors, while applying the adaptive Lasso to non-hub proteins for
variable selection. Our network-guided estimators are shown to have variable
selection consistency and asymptotic normality. Simulation results suggest that
our method outperforms existing methods and shows promise for advancing
biomarker identification in proteomics
research. Lastly, we apply our method to the Clinical Proteomic Tumor Analysis
Consortium (CPTAC) data and identify hub proteins that may serve as
prognostic biomarkers for various diseases, including rare genetic disorders,
and as immune checkpoints for cancer immunotherapy.
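
The two-step procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration under several assumptions that the abstract does not state: a continuous outcome with a linear model, hubs defined as the highest-degree nodes of a graphical-lasso network, ridge-based adaptive weights, and simulated low-dimensional data. The function names (`simulate`, `find_hubs`, `network_guided_fit`) and all tuning values are hypothetical, and the paper's actual estimator may differ.

```python
import numpy as np
from sklearn.covariance import GraphicalLasso
from sklearn.linear_model import Lasso, LinearRegression, Ridge


def simulate(n=300, p=20, seed=0):
    """Toy data (assumption): protein 0 is a hub driving proteins 1..8."""
    rng = np.random.default_rng(seed)
    proteins = rng.normal(size=(n, p))
    proteins[:, 1:9] += 0.66 * proteins[:, [0]]
    age = rng.normal(size=(n, 1))  # stand-in clinical covariate
    y = (1.0 * proteins[:, 0] + 1.5 * proteins[:, 15]
         + 0.8 * age[:, 0] + rng.normal(size=n))
    return proteins, age, y


def find_hubs(proteins, n_hubs=1, glasso_alpha=0.25):
    """Step 1: sparse Gaussian graphical model, hubs by degree centrality."""
    z = (proteins - proteins.mean(axis=0)) / proteins.std(axis=0)
    precision = GraphicalLasso(alpha=glasso_alpha).fit(z).precision_
    adjacency = np.abs(precision) > 1e-8  # nonzero partial correlation = edge
    np.fill_diagonal(adjacency, False)
    degree = adjacency.sum(axis=0)
    return np.argsort(-degree)[:n_hubs], degree


def network_guided_fit(proteins, clinical, y, hubs, lasso_alpha=0.1, gamma=1.0):
    """Step 2: keep hubs and clinical covariates unpenalized,
    apply an adaptive Lasso to the non-hub proteins."""
    p = proteins.shape[1]
    non_hubs = np.setdiff1d(np.arange(p), hubs)
    kept = np.column_stack([proteins[:, hubs], clinical])
    x_pen = proteins[:, non_hubs]

    # Partial out the unpenalized block (and intercept) by least squares;
    # for squared-error loss this is equivalent to leaving it penalty-free.
    def residualize(target):
        return target - LinearRegression().fit(kept, target).predict(kept)

    x_res, y_res = residualize(x_pen), residualize(y)

    # Adaptive weights from an initial ridge fit: w_j = 1 / |b_j|^gamma.
    init = Ridge(alpha=1.0, fit_intercept=False).fit(x_res, y_res).coef_
    weights = 1.0 / (np.abs(init) ** gamma + 1e-8)

    # Weighted Lasso via column rescaling, then map back to the original scale.
    lasso = Lasso(alpha=lasso_alpha, fit_intercept=False)
    beta_pen = lasso.fit(x_res / weights, y_res).coef_ / weights

    # Recover the unpenalized coefficients given the selected non-hub effects.
    kept_fit = LinearRegression().fit(kept, y - x_pen @ beta_pen)

    coef = np.zeros(p)
    coef[hubs] = kept_fit.coef_[: len(hubs)]
    coef[non_hubs] = beta_pen
    return {
        "protein_coef": coef,
        "clinical_coef": kept_fit.coef_[len(hubs):],
        "intercept": kept_fit.intercept_,
    }


proteins, clinical, y = simulate()
hubs, degree = find_hubs(proteins)
fit = network_guided_fit(proteins, clinical, y, hubs)
selected = np.setdiff1d(np.flatnonzero(fit["protein_coef"]), hubs)
print("hub proteins:", hubs)
print("selected non-hub proteins:", selected)
```

In this sketch the hub and clinical coefficients are never shrunk, so they always remain in the model, while non-hub proteins enter only if they survive the weighted penalty. A survival or binary outcome would require a different loss than the squared error assumed here.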