A semi-supervised machine learning framework for microRNA classification.

Journal: Human genomics
Published Date:

Abstract

BACKGROUND: MicroRNAs (miRNAs) are a family of short, non-coding RNAs that have been linked to critical cellular activities, most notably the regulation of gene expression. Identifying miRNAs is a cross-disciplinary task that requires both computational methods and wet-lab validation experiments, making it a resource-intensive procedure. While numerous machine learning methods have been developed to increase classification accuracy and thus reduce validation costs, most rely on supervised learning and therefore require large labeled training data sets, which are often unavailable for less-sequenced species. On the other hand, there is now an abundance of unlabeled RNA sequence data owing to the emergence of high-throughput wet-lab experimental procedures, such as next-generation sequencing.

Authors

  • Mohsen Sheikh Hassani
    Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada.
  • James R Green
    Department of Systems and Computer Engineering, Carleton University, Ottawa, Ontario, Canada. jrgreen@sce.carleton.ca.