LS-GKM: a new gkm-SVM for large-scale datasets.

Journal: Bioinformatics (Oxford, England)
Published Date:

Abstract

UNLABELLED: gkm-SVM is a sequence-based method for predicting and detecting the regulatory vocabulary encoded in functional DNA elements, and is a commonly used tool for studying gene regulatory mechanisms. Here we introduce new software, LS-GKM, which removes several limitations of our previous releases, enabling training on much larger scale (LS) datasets. LS-GKM also provides additional advanced gapped k-mer based kernel functions. With these improvements, LS-GKM achieves considerably higher accuracy than the original gkm-SVM.

Authors

  • Dongwon Lee
    McKusick-Nathans Institute of Genetic Medicine, Johns Hopkins University, Baltimore, MD 21205, USA.