FLOating-Window Projective Separator (FloWPS): A Data Trimming Tool for Support Vector Machines (SVM) to Improve Robustness of the Classifier.

Journal: Frontiers in Genetics
Published Date:

Abstract

Here, we propose a heuristic data-trimming technique for SVM termed FLOating-Window Projective Separator (FloWPS), tailored for personalized predictions based on molecular data. This procedure can operate on high-throughput genetic datasets such as gene expression or mutation profiles. Its application prevents SVM from extrapolating by excluding non-informative features. FloWPS requires training on data from individuals with known clinical outcomes to create a clinically relevant classifier. The genetic profiles linked with the outcomes are split, as usual, into training and validation datasets. The unique property of FloWPS is that irrelevant features in the validation dataset that do not have a significant number of neighboring hits in the training dataset are removed from further analysis. Next, similarly to the k-nearest neighbors (kNN) method, for each point of the validation dataset, FloWPS takes into account only the proximal points of the training dataset. Thus, for every point of the validation dataset, the training dataset is adjusted to form a floating window. FloWPS performance was tested on ten gene expression datasets for 992 cancer patients who either responded or did not respond to different types of chemotherapy. We confirmed experimentally by leave-one-out cross-validation that FloWPS significantly improves the quality of a classifier built on classical SVM in most of the applications, particularly for polynomial kernels.
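
The two trimming steps outlined in the abstract (feature removal, then selection of proximal points) can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that a feature has enough "neighboring hits" when at least m training values lie on each side of the validation value along that feature, and that proximity is Euclidean distance in the retained feature space. The function names and the parameters m and k are hypothetical. The resulting window would then be passed to any SVM implementation to classify the single validation point.

```python
import math


def select_features(train, point, m):
    """Return indices of features for which at least m training values lie
    strictly below and at least m strictly above the point's value.
    (Assumed reading of 'neighboring hits'; m is a hypothetical parameter.)"""
    kept = []
    for j, value in enumerate(point):
        below = sum(1 for row in train if row[j] < value)
        above = sum(1 for row in train if row[j] > value)
        if below >= m and above >= m:
            kept.append(j)
    return kept


def floating_window(train, labels, point, m, k):
    """Build the trimmed training set (the 'window') for one validation point.

    Returns (kept feature indices, k nearest training rows projected onto
    those features, labels of those rows)."""
    kept = select_features(train, point, m)
    projected_point = [point[j] for j in kept]
    projected = [[row[j] for j in kept] for row in train]
    # Rank training rows by Euclidean distance in the reduced feature space.
    order = sorted(
        range(len(train)),
        key=lambda i: math.dist(projected[i], projected_point),
    )
    nearest = order[:k]
    return kept, [projected[i] for i in nearest], [labels[i] for i in nearest]


# Toy data: the second feature of the validation point lies outside the
# training range, so an SVM would have to extrapolate along it.
train = [[1.0, 10.0], [2.0, 11.0], [3.0, 12.0], [4.0, 13.0]]
labels = [0, 0, 1, 1]
point = [2.5, 20.0]

kept, window, window_labels = floating_window(train, labels, point, m=1, k=2)
print(kept, window, window_labels)
```

In this toy case the second feature is dropped because no training value lies above 20.0, and only the two training rows closest to the point along the first feature remain in the window. Because the window is rebuilt for every validation point, leave-one-out cross-validation amounts to calling `floating_window` once per held-out sample.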

Authors

  • Victor Tkachev
    Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.
  • Maxim Sorokin
    Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.
  • Artem Mescheryakov
    Yandex N.V. Corporation, Moscow, Russia.
  • Alexander Simonov
    Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.
  • Andrew Garazha
    Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.
  • Anton Buzdin
    Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.
  • Ilya Muchnik
    Hill Center, Rutgers University, Piscataway, NJ, United States.
  • Nicolas Borisov
    Department of Bioinformatics and Molecular Networks, OmicsWay Corporation, Walnut, CA, United States.
