Cancer subtyping with heterogeneous multi-omics data via hierarchical multi-kernel learning.

Journal: Briefings in bioinformatics
Published Date:

Abstract

Differentiating cancer subtypes is crucial for guiding personalized treatment and improving patient prognosis. Integrating multi-omics data can offer a comprehensive view of the biological processes underlying cancer and provide promising avenues for cancer diagnosis and treatment. Taking the heterogeneity of different omics data types into account, we propose hierarchical multi-kernel learning (hMKL), a novel cancer molecular subtyping method that identifies cancer subtypes through a two-stage kernel learning strategy. In stage 1, borrowing the idea of cancer integration via multi-kernel learning (CIMLR), we obtain a composite kernel for each individual omics data type by optimizing its kernel parameters. In stage 2, we obtain a final fused kernel as a weighted linear combination of the individual kernels learned in stage 1, using an unsupervised multiple kernel learning method. k-means clustering is then applied to the final fused kernel to identify cancer subtypes. Simulation studies show that hMKL outperforms the one-stage CIMLR method when the data are heterogeneous. hMKL can also correctly estimate the number of clusters, a key challenge in subtyping. Application to two real data sets shows that hMKL identifies meaningful subtypes and key cancer-associated biomarkers. The proposed method provides a novel toolkit for heterogeneous multi-omics data integration and cancer subtype identification.
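
The two-stage flow described in the abstract can be sketched roughly as follows. This is an illustrative simplification, not the authors' implementation: stage 1 here averages Gaussian kernels at a few bandwidths instead of optimizing kernel parameters as CIMLR does, stage 2 uses uniform weights instead of weights learned by unsupervised multiple kernel learning, and k-means is run on a kernel PCA embedding of the fused kernel. All function names, the bandwidth multipliers, and the toy data are assumptions made for the example.

```python
import numpy as np

def pairwise_sq_dists(X):
    # Squared Euclidean distances between samples (rows of X).
    sq = np.sum(X**2, axis=1)
    return np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)

def stage1_kernel(X, multipliers=(0.5, 1.0, 2.0)):
    # Stand-in for stage 1: one composite kernel for ONE omics type.
    # Assumption: uniform average of Gaussian kernels whose bandwidths are
    # multiples of the median pairwise distance (CIMLR learns these instead).
    d2 = pairwise_sq_dists(X)
    med = np.median(np.sqrt(d2[np.triu_indices_from(d2, k=1)]))
    return np.mean([np.exp(-d2 / (2.0 * (m * med) ** 2)) for m in multipliers], axis=0)

def fuse_kernels(kernels, weights=None):
    # Stage 2: weighted linear combination of the per-omics kernels.
    # Assumption: uniform weights unless the caller supplies learned ones.
    if weights is None:
        weights = np.ones(len(kernels))
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()
    return sum(w * K for w, K in zip(weights, kernels))

def kernel_embedding(K, n_components):
    # Kernel PCA coordinates from the top eigenvectors of the centered kernel.
    n = K.shape[0]
    H = np.eye(n) - np.ones((n, n)) / n
    vals, vecs = np.linalg.eigh(H @ K @ H)
    idx = np.argsort(vals)[::-1][:n_components]
    return vecs[:, idx] * np.sqrt(np.maximum(vals[idx], 0.0))

def kmeans(Z, k, n_iter=100):
    # Lloyd's algorithm with deterministic farthest-point initialization.
    centers = [Z[0]]
    for _ in range(1, k):
        d = np.min([np.sum((Z - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(Z[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        new_centers = np.array([
            Z[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels

# Toy data: 40 samples in two groups, measured on two synthetic "omics"
# layers with different dimensions and scales (purely hypothetical values).
rng = np.random.default_rng(0)
truth = np.repeat([0, 1], 20)
omics_a = rng.normal(size=(40, 50)) + 3.0 * truth[:, None]
omics_b = rng.normal(scale=5.0, size=(40, 30)) + 15.0 * truth[:, None]

per_omics = [stage1_kernel(omics_a), stage1_kernel(omics_b)]  # stage 1
fused = fuse_kernels(per_omics)                               # stage 2
labels = kmeans(kernel_embedding(fused, n_components=2), k=2)
print(labels)
```

Building one kernel per omics type before fusing keeps each layer's scale and dimensionality from dominating the combined similarity, which is the motivation the abstract gives for the hierarchical design.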

Authors

  • Yifang Wei
    Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China.
  • Lingmei Li
    Department of Pathology, Tianjin Medical University Cancer Institute and Hospital, National Clinical Research Center for Cancer, Key Laboratory of Cancer Prevention and Therapy, Tianjin; Tianjin's Clinical Research Center for Cancer, Tianjin Medical University, Ministry of Education, Tianjin 300060, China.
  • Xin Zhao
    Florida International University.
  • Haitao Yang
    Jiangnan University.
  • Jian Sa
    Department of Science and Technology, Shanxi Provincial Key Laboratory of Major Disease Risk Assessment, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China.
  • Hongyan Cao
    Division of Health Statistics, School of Public Health, Shanxi Medical University, Taiyuan, Shanxi 030001, PR China.
  • Yuehua Cui
    Department of Statistics and Probability, Michigan State University, East Lansing, MI 48824, USA.