AIMC Topic: Open Reading Frames

Showing 11 to 20 of 23 articles

A Support Vector Machine based method to distinguish long non-coding RNAs from protein coding transcripts.

BMC genomics
BACKGROUND: In recent years, a rapidly increasing number of RNA transcripts has been generated by thousands of sequencing projects around the world, creating enormous volumes of transcript data to be analyzed. An important problem to be addressed whe...

Fangorn Forest (F2): a machine learning approach to classify genes and genera in the family Geminiviridae.

BMC bioinformatics
BACKGROUND: Geminiviruses infect a broad range of cultivated and non-cultivated plants, causing significant economic losses worldwide. The studies of the diversity of species, taxonomy, mechanisms of evolution, geographic distribution, and mechanisms...

Geminivirus data warehouse: a database enriched with machine learning approaches.

BMC bioinformatics
BACKGROUND: The Geminiviridae family encompasses a group of single-stranded DNA viruses with twinned and quasi-isometric virions, which infect a wide range of dicotyledonous and monocotyledonous plants and are responsible for significant economic los...

LncRNA-ID: Long non-coding RNA IDentification using balanced random forests.

Bioinformatics (Oxford, England)
MOTIVATION: Long non-coding RNAs (lncRNAs), which are non-coding RNAs longer than 200 nucleotides, perform important biological functions such as regulating gene expression. To fully reveal the functions of lncRNAs, a fundamental step is to annotate...

Analysis of RNA translation with a deep learning architecture provides new insight into translation control.

Nucleic acids research
Accurate annotation of coding regions in RNAs is essential for understanding gene translation. We developed a deep neural network to directly predict and analyze translation initiation and termination sites from RNA sequences. Trained with human tran...

Discovering misannotated lncRNAs using deep learning training dynamics.

Bioinformatics (Oxford, England)
MOTIVATION: Recent experimental evidence has shown that some long non-coding RNAs (lncRNAs) contain small open reading frames (sORFs) that are translated into functional micropeptides, suggesting that these lncRNAs are misannotated as non-coding. Cur...

Feature extraction approaches for biological sequences: a comparative study of mathematical features.

Briefings in bioinformatics
As a consequence of the various genomic sequencing projects, an increasing volume of biological sequence data is being produced. Although machine learning algorithms have been successfully applied to a large number of genomic sequence-related problems,...

DeepCPP: a deep neural network based on nucleotide bias information and minimum distribution similarity feature selection for RNA coding potential prediction.

Briefings in bioinformatics
The development of deep sequencing technologies has led to the discovery of novel transcripts. Many in silico methods have been developed to assess the coding potential of these transcripts to further investigate their functions. Existing methods per...

OCCAM: prediction of small ORFs in bacterial genomes by means of a target-decoy database approach and machine learning techniques.

Database: the journal of biological databases and curation
Small open reading frames (ORFs) have been systematically disregarded by automatic genome annotation. The difficulty of finding patterns in such short sequences is the main reason small ORFs are overlooked by computational procedures. However,...