Analysis of heterogeneous genomic samples using image normalization and machine learning.
Journal:
BMC Genomics
Published Date:
Dec 21, 2020
Abstract
BACKGROUND: Analysis of heterogeneous populations such as viral quasispecies is one of the most challenging problems in bioinformatics. Although machine learning models are becoming widely employed for the analysis of sequence data from such populations, their straightforward application is impeded by multiple challenges: technological limitations and biases, the difficulty of selecting relevant features, and the need to compare genomic datasets of different sizes and structures.