Keeping up with the genomes: efficient learning of our increasing knowledge of the tree of life.

Journal: BMC bioinformatics

Published Date: Sep 21, 2020

Abstract

BACKGROUND: Current metagenomic classifiers struggle computationally to keep pace with the training data generated by genome sequencing projects, such as the exponentially growing NCBI RefSeq bacterial genome database. When new reference sequences are added to the training data, statically trained classifiers must be retrained on all of the data, which is highly inefficient. The rich literature on "incremental learning" addresses the need to update an existing classifier to accommodate new data without sacrificing much accuracy compared with retraining the classifier on all data.
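
To make the idea of an incremental update concrete, here is a minimal sketch in Python. It is an illustration under stated assumptions, not the authors' implementation: it assumes a count-based naive Bayes classifier over k-mers with add-one smoothing and a uniform prior over taxa, and the class name `IncrementalKmerNB`, the k-mer length, and the toy sequences are all hypothetical. Because such a model is fully described by per-taxon k-mer counts, adding a new reference genome only increments counts; the genomes seen earlier do not need to be reprocessed.

```python
import math
from collections import Counter, defaultdict


def kmers(seq, k):
    """Return all overlapping k-mers of a sequence."""
    return [seq[i:i + k] for i in range(len(seq) - k + 1)]


class IncrementalKmerNB:
    """Hypothetical count-based naive Bayes over k-mers (illustrative only)."""

    def __init__(self, k=3):
        self.k = k
        self.counts = defaultdict(Counter)  # taxon -> k-mer counts
        self.totals = Counter()             # taxon -> total k-mers seen

    def update(self, taxon, sequence):
        """Add one reference sequence; only this taxon's counts change."""
        ks = kmers(sequence, self.k)
        self.counts[taxon].update(ks)
        self.totals[taxon] += len(ks)

    def classify(self, read):
        """Return the taxon with the highest smoothed log-likelihood."""
        vocab = 4 ** self.k  # assumes a DNA alphabet of A, C, G, T
        best_taxon, best_score = None, -math.inf
        for taxon, taxon_counts in self.counts.items():
            denom = self.totals[taxon] + vocab  # add-one smoothing
            score = sum(
                math.log((taxon_counts[km] + 1) / denom)
                for km in kmers(read, self.k)
            )
            if score > best_score:
                best_taxon, best_score = taxon, score
        return best_taxon


# Initial training on two toy "genomes".
model = IncrementalKmerNB(k=3)
model.update("taxon_a", "AAAAAAAAAA")
model.update("taxon_b", "CCCCCCCCCC")

# Later, a new reference arrives: update in place, no retraining.
model.update("taxon_c", "GGGGGGGGGG")
```

In this sketch the incrementally updated model holds exactly the same counts as one built from all three sequences at once, so no accuracy is lost for this particular model family. That equivalence is a property of the assumed count-based model and does not hold for classifiers in general.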

Authors

Zhengqiao Zhao

Ecological and Evolutionary Signal-process and Informatics (EESI) Lab, Department of Electrical and Computer Engineering, Drexel University, Market Street, Philadelphia, US.

Alexandru Cristian

Department of Computer Science, Drexel University, Market Street, Philadelphia, US.

Gail Rosen

Keywords

Algorithms; Bacteria; Bayes Theorem; Gastrointestinal Microbiome; Genome, Bacterial; Humans; Machine Learning; Metagenome; Metagenomics; Sequence Analysis, DNA

External Resources

PubMed ID: 32957925
