Genome-wide pre-miRNA discovery from few labeled examples.

Journal: Bioinformatics (Oxford, England)

PMID: 29028911

Abstract

MOTIVATION: Although many machine learning techniques have been proposed for distinguishing miRNA hairpins from other stem-loop sequences, most current methods rely on supervised learning, which requires a high-quality set of positive and negative examples. These methods have important practical limitations when applied to a real prediction task. First, only a small number of well-known positive pre-miRNA examples are available. Second, it is very difficult to build a set of negative examples that represents the full spectrum of non-miRNA sequences. Third, in any genome there is a huge class imbalance (1:10 000), which is well known to affect supervised classifiers in particular.

Authors

C Yones

Research Institute for Signals, Systems and Computational Intelligence, sinc(i), UNL-CONICET. Ciudad Universitaria, 4to piso FICH, Santa Fe 3000, Argentina.

G Stegmayer

Research Institute for Signals, Systems and Computational Intelligence, sinc(i), UNL-CONICET. Ciudad Universitaria, 4to piso FICH, Santa Fe 3000, Argentina.

D H Milone

Research Institute for Signals, Systems and Computational Intelligence, sinc(i), UNL-CONICET. Ciudad Universitaria, 4to piso FICH, Santa Fe 3000, Argentina.

Keywords

Animals; Anopheles; Arabidopsis; Caenorhabditis elegans; Computational Biology; Eukaryota; Genome; Genomics; MicroRNAs; Nucleic Acid Conformation; Sequence Analysis, DNA; Sequence Analysis, RNA; Supervised Machine Learning

