An autoencoder learning method for predicting breast cancer subtypes.

Journal: PloS one
Published Date:

Abstract

The heterogeneity of breast cancer poses several challenges for detection and treatment. With next-generation sequencing, we can now map the transcriptional profile of each patient's breast tissue, which has the potential to identify and characterize cancer subtypes. However, the high dimensionality of these transcriptomic data and the heterogeneity among the molecular profiles of breast cancers pose a barrier to identifying minimal markers and mechanistic consequences. In this study, we develop an autoencoder to identify a reduced set of gene markers that characterizes the four major breast cancer subtypes with an accuracy of 82.38%. The reduced feature space created by our model captures the functional characteristics of each breast cancer subtype, highlighting mechanisms that are unique to each subtype as well as those that are shared. This high prediction accuracy shows that our markers can be valuable for breast cancer subtype detection and have the potential to provide insights into the mechanisms associated with each subtype.
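
The abstract does not specify the network architecture or how markers are extracted from it, so the following is only an illustrative sketch of the general idea: an autoencoder compresses a samples-by-genes expression matrix into a small code, and input features can then be ranked by how strongly they feed that code. Everything here is an assumption made for illustration, including the single tanh hidden layer, the synthetic data standing in for expression profiles, the hyperparameters, and the hypothetical helper names `train_autoencoder` and `rank_features`. It is not the authors' model.

```python
import numpy as np


def train_autoencoder(X, n_hidden=3, lr=0.02, epochs=500, seed=0):
    """Fit a one-hidden-layer autoencoder (tanh encoder, linear decoder)
    with full-batch gradient descent on mean squared reconstruction error.
    Architecture and hyperparameters are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n_features = X.shape[1]
    W_enc = rng.normal(0.0, 0.1, (n_features, n_hidden))
    b_enc = np.zeros(n_hidden)
    W_dec = rng.normal(0.0, 0.1, (n_hidden, n_features))
    b_dec = np.zeros(n_features)
    losses = []
    for _ in range(epochs):
        H = np.tanh(X @ W_enc + b_enc)        # reduced representation
        err = H @ W_dec + b_dec - X           # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        # Backpropagate the mean squared error through both layers.
        g_out = 2.0 * err / err.size
        g_pre = (g_out @ W_dec.T) * (1.0 - H ** 2)
        W_dec -= lr * (H.T @ g_out)
        b_dec -= lr * g_out.sum(axis=0)
        W_enc -= lr * (X.T @ g_pre)
        b_enc -= lr * g_pre.sum(axis=0)
    return W_enc, b_enc, losses


def rank_features(W_enc):
    """Order input features by the L2 norm of their encoder weights,
    largest first. This is one simple heuristic for marker selection,
    assumed here; the source does not state which criterion was used."""
    scores = np.linalg.norm(W_enc, axis=1)
    return np.argsort(scores)[::-1]


# Synthetic stand-in for an expression matrix: 120 samples, 20 "genes"
# driven by 2 hidden factors plus noise (not real data).
rng = np.random.default_rng(42)
latent = rng.normal(size=(120, 2))
mixing = rng.normal(size=(2, 20))
X = latent @ mixing + 0.1 * rng.normal(size=(120, 20))
X = (X - X.mean(axis=0)) / X.std(axis=0)      # standardize each feature

W_enc, b_enc, losses = train_autoencoder(X)
codes = np.tanh(X @ W_enc + b_enc)            # reduced feature space
ranking = rank_features(W_enc)                # candidate marker order
print(codes.shape, losses[0], losses[-1], ranking[:5])
```

In a setting like the one described, the reduced codes (or the top-ranked features) would then be passed to a separate classifier to predict subtype labels; that step is omitted here because the abstract gives no details about it.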

Authors

  • Zahra Rostami
    Department of Computer Science and Engineering, University of California San Diego, San Diego, California, United States of America.
  • Kavitha Mukund
    Department of Bioengineering, University of California San Diego, San Diego, California, United States of America.
  • Maryam Masnadi-Shirazi
    University of California San Diego, Department of Bioengineering, La Jolla, CA, 92093, USA.
  • Shankar Subramaniam
    University of California San Diego, Departments of Bioengineering and Computer Science & Engineering, La Jolla, CA, 92093, USA. shankar@ucsd.edu.