Biological sequence modeling with convolutional kernel networks.
Journal: Bioinformatics (Oxford, England)
Published Date: Sep 15, 2019
Abstract
MOTIVATION: The growing number of annotated biological sequences makes it possible to learn genotype-phenotype relationships from data with increasingly high accuracy. When large quantities of labeled samples are available for training, convolutional neural networks can predict the phenotype of unannotated sequences with good accuracy. Unfortunately, their performance degrades on medium- or small-scale datasets, which calls for new data-efficient approaches.