To enable the application of deep learning in biology, we present Selene (https://selene.flatironinstitute.org/), a PyTorch-based deep learning library for fast and easy development, training, and application of deep learning model architectures for ...
Proceedings of the National Academy of Sciences of the United States of America
Mar 6, 2019
Deep learning methodologies have revolutionized prediction in many fields and show potential to do the same in molecular biology and genetics. However, applying these methods in their current forms ignores evolutionary dependencies within biological ...
Accurate detection of somatic mutations is still a challenge in cancer analysis. Here we present NeuSomatic, the first convolutional neural network approach for somatic mutation detection, which significantly outperforms previous methods on different...
Machine-learning approaches (MLAs) for DNA barcoding outperform distance- and tree-based methods on identification accuracy and cost-effectiveness to arrive at species-level identification of wood. DNA barcoding is a promising tool to combat illegal ...
The accurate identification of DNA sequence variants is an important, but challenging task in genomics. It is particularly difficult for single molecule sequencing, which has a per-nucleotide error rate of ~5-15%. Meeting this demand, we developed Cl...
An enhancer is a short (50-1500bp) region of DNA that plays an important role in gene expression and the production of RNA and proteins. Genetic variation in enhancers has been linked to many human diseases, such as cancer, disorder or inflammatory b...
Standard clinical interpretation of DNA copy number variants (CNVs) identified by cytogenomic microarray involves examining protein-coding genes within the region and comparison to other CNVs. Emerging basic research suggests that CNVs can also exert...
With the continuous accumulation of biological data, more and more machine learning algorithms have been introduced into the field of gene function prediction, which has great significance in decoding the secret of life. Recently, a multi-label super...
Database : the journal of biological databases and curation
Jan 1, 2019
High-throughput studies constitute an essential and valued source of information for researchers. However, high-throughput experimental workflows are often complex, with multiple data sets that may contain large numbers of false positives. The repres...
Generation of huge "omics" data necessitates the development and application of computational methods to annotate the data in terms of biological features. In the context of DNA sequence, it is important to unravel the hidden physicochemical signatu...