Optimizing ChIP-seq peak detectors using visual labels and supervised machine learning.

Journal: Bioinformatics (Oxford, England)

Published Date: Feb 15, 2017

Abstract

MOTIVATION: Many peak detection algorithms have been proposed for ChIP-seq data analysis, but it is not obvious which algorithm and what parameters are optimal for any given dataset. In contrast, regions with and without obvious peaks can be easily labeled by visual inspection of aligned read counts in a genome browser. We propose a supervised machine learning approach for ChIP-seq data analysis, using labels that encode qualitative judgments about which genomic regions contain or do not contain peaks. The main idea is to manually label a small subset of the genome, and then learn a model that makes consistent peak predictions on the rest of the genome.
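
The main idea can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes a toy detector that calls a peak wherever coverage reaches a threshold, and two hypothetical label types, "peaks" (the region should contain a peak) and "noPeaks" (it should not). All function names, label names, and data are invented for illustration. The sketch counts how many labels each candidate threshold contradicts and keeps the threshold with the fewest errors, which could then be applied to unlabeled regions.

```python
from typing import Dict, List, Tuple

Region = Tuple[int, int]      # (start, end), half-open interval
Label = Tuple[int, int, str]  # (start, end, kind); kind is "peaks" or "noPeaks" (assumed names)


def overlaps(a_start: int, a_end: int, b_start: int, b_end: int) -> bool:
    """True if two half-open intervals share at least one position."""
    return a_start < b_end and b_start < a_end


def label_errors(peaks: List[Region], labels: List[Label]) -> int:
    """Count labels that the predicted peaks contradict."""
    errors = 0
    for start, end, kind in labels:
        has_peak = any(overlaps(start, end, p0, p1) for p0, p1 in peaks)
        if kind == "peaks" and not has_peak:
            errors += 1  # false negative: labeled peak was missed
        elif kind == "noPeaks" and has_peak:
            errors += 1  # false positive: peak predicted in a peak-free region
    return errors


def threshold_detector(coverage: List[int], threshold: int) -> List[Region]:
    """Toy detector (an assumption): maximal runs with coverage >= threshold."""
    peaks: List[Region] = []
    start = None
    for pos, count in enumerate(coverage):
        if count >= threshold and start is None:
            start = pos
        elif count < threshold and start is not None:
            peaks.append((start, pos))
            start = None
    if start is not None:
        peaks.append((start, len(coverage)))
    return peaks


def select_threshold(
    coverage: List[int], labels: List[Label], candidates: List[int]
) -> Tuple[int, Dict[int, int]]:
    """Pick the candidate threshold with the fewest label errors."""
    errors = {
        t: label_errors(threshold_detector(coverage, t), labels)
        for t in candidates
    }
    best = min(candidates, key=lambda t: errors[t])
    return best, errors


# Hypothetical aligned read counts and visual labels for a small labeled region.
coverage = [0, 1, 0, 2, 9, 12, 10, 2, 1, 0, 3, 1, 0, 8, 11, 9, 1, 0]
labels = [
    (0, 3, "noPeaks"),
    (3, 8, "peaks"),
    (8, 12, "noPeaks"),
    (12, 17, "peaks"),
]
best, errors = select_threshold(coverage, labels, [1, 5, 20])
print(best, errors)
```

In this toy example, a threshold that is too low predicts peaks in regions labeled as peak-free, and a threshold that is too high misses the labeled peaks. The labels therefore pick out the intermediate setting.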

Authors

Toby Dylan Hocking
Department of Human Genetics, McGill University, H3A-1A4, Montréal, Canada.

Patricia Goerner-Potvin
Department of Human Genetics, McGill University, H3A-1A4, Montréal, Canada.

Andreanne Morin
Department of Human Genetics, McGill University, H3A-1A4, Montréal, Canada.

Xiaojian Shao
Department of Human Genetics, McGill University, H3A-1A4, Montréal, Canada.

Tomi Pastinen
Department of Human Genetics, McGill University, H3A-1A4, Montréal, Canada.

Guillaume Bourque
Department of Human Genetics, McGill University, H3A-1A4, Montréal, Canada.

Keywords

Animals; Chromatin Immunoprecipitation; Genomics; Humans; Sequence Analysis, DNA; Software; Supervised Machine Learning

External Resources

PubMed ID: 27797775
