LS-GKM: a new gkm-SVM for large-scale datasets.
Journal:
Bioinformatics (Oxford, England)
Published Date:
Mar 15, 2016
Abstract
gkm-SVM is a sequence-based method for predicting and detecting the regulatory vocabulary encoded in functional DNA elements, and is a commonly used tool for studying gene regulatory mechanisms. Here we introduce new software, LS-GKM, which removes several limitations of our previous releases and enables training on much larger-scale (LS) datasets. LS-GKM also provides additional advanced gapped k-mer-based kernel functions. With these improvements, LS-GKM achieves considerably higher accuracy than the original gkm-SVM.
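To make the notion of a gapped k-mer kernel concrete, the following is a minimal illustrative sketch and not LS-GKM's implementation. It assumes the simplest formulation: each window of length `l` contributes one feature for every choice of `k` informative positions, with the remaining positions masked, and the kernel value is the dot product of the two count vectors. The function names, the `-` gap symbol, and the small default values of `l` and `k` are assumptions made for illustration only.

```python
from collections import Counter
from itertools import combinations


def gapped_kmer_counts(seq, l=4, k=3):
    """Count gapped k-mers in seq by naive enumeration.

    Every window of length l yields one feature per choice of k
    informative positions; the other l - k positions become '-'.
    (Illustrative only; names and defaults are hypothetical.)
    """
    counts = Counter()
    for start in range(len(seq) - l + 1):
        window = seq[start:start + l]
        for keep in combinations(range(l), k):
            keep_set = set(keep)
            gapped = "".join(
                base if pos in keep_set else "-"
                for pos, base in enumerate(window)
            )
            counts[gapped] += 1
    return counts


def gkm_kernel(a, b, l=4, k=3):
    """Unnormalized kernel: dot product of gapped k-mer count vectors."""
    counts_a = gapped_kmer_counts(a, l, k)
    counts_b = gapped_kmer_counts(b, l, k)
    return sum(n * counts_b[g] for g, n in counts_a.items())


def normalized_gkm_kernel(a, b, l=4, k=3):
    """Cosine-style normalization so that a sequence scores 1.0 with itself."""
    denom = (gkm_kernel(a, a, l, k) * gkm_kernel(b, b, l, k)) ** 0.5
    return gkm_kernel(a, b, l, k) / denom if denom else 0.0
```

For example, `gkm_kernel("ACGT", "ACGA")` is 1 under these assumptions, because the two sequences share only the gapped feature `ACG-`. Enumerating every gap pattern explicitly, as done here, becomes expensive as `l` grows, so this sketch is suited to explanation rather than to large datasets.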