Discriminating Neoplastic from Nonneoplastic Tissues Using an miRNA-Based Deep Cancer Classifier.

Journal: The American journal of pathology
Published Date:

Abstract

Next-generation sequencing has enabled the collection of large biological data sets, allowing novel molecular-based classification methods to be developed for increased understanding of disease. miRNAs are small regulatory RNA molecules that can be quantified using next-generation sequencing and are excellent classificatory markers. Herein, a deep cancer classifier (DCC) was adapted to differentiate neoplastic from nonneoplastic samples using comprehensive miRNA expression profiles from 1031 human breast and skin tissue samples. The classifier was fine-tuned and evaluated using 750 neoplastic and 281 nonneoplastic breast and skin tissue samples. Performance of the DCC was compared with two machine-learning classifiers: support vector machines and random forests. In addition, feature extraction through the DCC was compared with cancer specificity, a feature selection algorithm developed by the authors. The DCC achieved the highest area under the receiver operating characteristic curve and performed well in both sensitivity and specificity, whereas the machine-learning and feature selection models often performed better in one of these metrics than in the other. In particular, deep learning had noticeable advantages with highly heterogeneous data sets. In addition, the cancer specificity algorithm identified candidate biomarkers for differentiating neoplastic and nonneoplastic tissue samples (eg, miR-144 and miR-375 in breast cancer and miR-375 and miR-451 in skin cancer).
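
The three evaluation metrics named in the abstract (sensitivity, specificity, and area under the receiver operating characteristic curve) can be illustrated with a minimal sketch. This is not the authors' evaluation code: the function names, the toy labels and scores, and the 0.5 decision threshold are all assumptions made for illustration, with 1 standing for neoplastic and 0 for nonneoplastic.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(
    y_true: Sequence[int], y_pred: Sequence[int]
) -> Tuple[float, float]:
    """Return (sensitivity, specificity) for binary labels (1 = neoplastic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive sample outscores
    a random negative sample; tied scores count as half a win."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores for eight samples (illustrative values only).
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.6, 0.2, 0.1]

# Assumed decision threshold of 0.5 to turn scores into class predictions.
y_pred = [1 if s >= 0.5 else 0 for s in scores]

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
print(f"sensitivity={sens:.4f} specificity={spec:.4f} auc={auc:.4f}")
```

Sensitivity and specificity depend on the chosen threshold, whereas the area under the curve summarizes ranking quality across all thresholds. Reporting all three, as the abstract does, shows whether a model trades one error type for the other.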

Authors

  • Emily Kaczmarek
    Medical Informatics Laboratory, School of Computing, Queen's University, 557 Goodwin Hall, Kingston, ON K7L 2N8, Canada.
  • Blake Pyman
    School of Computing, Queen's University, Kingston, Ontario K7L 3N6, Canada; http://www.queensu.ca/. Electronic address: pyman@cs.queensu.ca.
  • Jina Nanayakkara
    Laboratory of Translational RNA Biology, Department of Pathology and Molecular Medicine, Queen's University, 88 Stuart St, Kingston, ON K7L 3N6, Canada.
  • Thomas Tuschl
    Laboratory of RNA Molecular Biology, Rockefeller University, New York, New York.
  • Kathrin Tyryshkin
    Laboratory of Translational RNA Biology, Department of Pathology and Molecular Medicine, Queen's University, 88 Stuart St, Kingston, ON K7L 3N6, Canada; School of Computing, Queen's University, 557 Goodwin Hall, Kingston, ON K7L 2N8, Canada. Electronic address: kt40@queensu.ca.
  • Neil Renwick
    Laboratory of Translational RNA Biology, Department of Pathology and Molecular Medicine, Queen's University, 88 Stuart St, Kingston, ON K7L 3N6, Canada.
  • Parvin Mousavi
    Medical Informatics Laboratory, School of Computing, Queen's University, 557 Goodwin Hall, Kingston, ON K7L 2N8, Canada.