Classifying Breast Cancer Subtypes Using Multiple Kernel Learning Based on Omics Data.

Journal: Genes
PMID:

Abstract

Exploring the intrinsic differences among breast cancer subtypes is of great significance, because these differences are closely related to clinical diagnosis and the design of treatment plans. With the accumulation of biological and medical datasets, many different types of omics data are now available, each describing the disease from a different aspect, and combining them can improve prediction accuracy. Many public databases also make these different types of omics data available for download. In this article, we use estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2) to define breast cancer subtypes, and we classify each pair of subtypes using the SMO-MKL algorithm. We collected mRNA, methylation, and copy number variation (CNV) data from TCGA to classify breast cancer subtypes. Multiple Kernel Learning (MKL) is employed so that each type of omics data is treated distinctly. Using all three types of omics data with multiple kernels gives better results than using a single type of omics data with multiple kernels. Furthermore, the significant genes and pathways discovered in the feature selection process are also analyzed. In experiments, the proposed method outperforms other state-of-the-art methods and yields results with abundant biological interpretations.
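
As an illustration of the multiple-kernel idea described in the abstract, the following minimal sketch builds one kernel per omics view and merges them into a single Gram matrix. It is not the authors' SMO-MKL implementation: the toy feature values, the choice of RBF kernels, the gamma values, and the hand-fixed weights are all assumptions made for illustration, and the helper names rbf_kernel and combine_kernels are hypothetical. In an actual MKL solver the kernel weights would be learned jointly with the classifier rather than set by hand.

```python
import math

def rbf_kernel(X, gamma):
    """Return the Gram matrix K[i][j] = exp(-gamma * ||x_i - x_j||^2)."""
    n = len(X)
    return [
        [math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(X[i], X[j])))
         for j in range(n)]
        for i in range(n)
    ]

def combine_kernels(kernels, weights):
    """Return the weighted sum of Gram matrices (weights must be non-negative)."""
    if len(kernels) != len(weights):
        raise ValueError("need exactly one weight per kernel")
    if any(w < 0 for w in weights):
        raise ValueError("kernel weights must be non-negative")
    n = len(kernels[0])
    return [
        [sum(w * K[i][j] for w, K in zip(weights, kernels)) for j in range(n)]
        for i in range(n)
    ]

# Toy data (assumed values): the same 4 samples described by three omics views.
mrna = [[0.1, 0.2], [0.0, 0.3], [1.1, 0.9], [1.0, 1.2]]
methylation = [[0.5], [0.4], [0.9], [0.8]]
cnv = [[0, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 1]]

# One kernel per view; the gamma values are arbitrary illustrative choices.
kernels = [
    rbf_kernel(mrna, 1.0),
    rbf_kernel(methylation, 1.0),
    rbf_kernel(cnv, 0.5),
]

# Hand-fixed weights (assumption); an MKL solver would learn these instead.
weights = [0.5, 0.3, 0.2]
K = combine_kernels(kernels, weights)
```

The combined matrix K could then be passed to any kernel classifier that accepts a precomputed Gram matrix. Because a non-negative weighted sum of valid kernels is itself a valid kernel, the merged matrix stays symmetric and positive semidefinite.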

Authors

  • Mingxin Tao
    Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China. lilytmx18@163.com.
  • Tianci Song
    Department of Computer Science and Engineering, University of Minnesota, Minneapolis, MN, USA.
  • Wei Du
    Department of Respiratory and Critical Care Medicine, Ruijin Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai 200025, China.
  • Siyu Han
    Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China. hansy15@mails.jlu.edu.cn.
  • Chunman Zuo
    School of Life Sciences, Sun Yat-sen University, Guangzhou, 510275, China.
  • Ying Li
    School of Information Engineering, Chang'an University, Xi'an 710010, China.
  • Yan Wang
    College of Animal Science and Technology, Beijing University of Agriculture, Beijing, China.
  • Zekun Yang
    Key Laboratory of Symbol Computation and Knowledge Engineering of Ministry of Education, College of Computer Science and Technology, Jilin University, Changchun 130012, China. yangzkjlu@foxmail.com.