Identification of informative genes and pathways using an improved penalized support vector machine with a weighting scheme.

Journal: Computers in biology and medicine
PMID:

Abstract

Incorporating pathway knowledge into microarray analysis has improved the biological interpretation of analysis outcomes. However, most pathway data are manually curated without a specific biological context, so non-informative genes may be included when pathway data are used to analyze context-specific data such as cancer microarray data. Efficient identification of informative genes is therefore essential. Embedded methods such as penalized classifiers have been used for microarray analysis because gene selection is built into model fitting. This paper proposes an improved penalized support vector machine with an absolute t-test weighting scheme to identify informative genes and pathways. Experiments are conducted on four microarray data sets, and the results are compared with previous methods using 10-fold cross-validation in terms of accuracy, sensitivity, specificity and F-score. Our method shows consistent improvement over the previous methods, and biological validation has been carried out to elucidate the relation of the selected genes and pathways with the phenotype under study.
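
The abstract does not give the weighting formula or the evaluation code, so the following is only an illustrative sketch of the two ideas it names: a per-gene weight derived from the absolute value of a two-sample t-statistic, and the four reported metrics computed from binary predictions. The function names `abs_t_weights` and `binary_metrics`, the use of Welch's unequal-variance statistic, and the scaling of weights to [0, 1] are all assumptions made for illustration, not the authors' implementation.

```python
import numpy as np


def abs_t_weights(X, y):
    """Hypothetical helper: absolute Welch t-statistic per gene, scaled to [0, 1].

    X is a (samples x genes) expression matrix, y holds binary labels (0 or 1).
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    a, b = X[y == 0], X[y == 1]
    # Standard error of the difference in class means (Welch form).
    se = np.sqrt(a.var(axis=0, ddof=1) / len(a) + b.var(axis=0, ddof=1) / len(b))
    # Genes with zero variance in both classes get a t-statistic of 0
    # (a simplification chosen here to avoid division by zero).
    t = (a.mean(axis=0) - b.mean(axis=0)) / np.where(se > 0, se, np.inf)
    w = np.abs(t)
    return w / w.max() if w.max() > 0 else w


def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, specificity and F-score for labels in {0, 1}."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    tp = int(np.sum((y_true == 1) & (y_pred == 1)))
    tn = int(np.sum((y_true == 0) & (y_pred == 0)))
    fp = int(np.sum((y_true == 0) & (y_pred == 1)))
    fn = int(np.sum((y_true == 1) & (y_pred == 0)))
    sensitivity = tp / (tp + fn) if tp + fn else 0.0
    specificity = tn / (tn + fp) if tn + fp else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    denom = precision + sensitivity
    f_score = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f_score": f_score,
    }
```

In a penalized classifier, weights of this kind could plausibly be used to scale each gene's penalty so that strongly discriminating genes are shrunk less, and in a 10-fold cross-validation the metrics would be computed on each held-out fold and then averaged. How the paper actually combines the weights with the penalty is not stated in the abstract.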

Authors

  • Weng Howe Chan
    Artificial Intelligence and Bioinformatics Research Group, Faculty of Computing, Universiti Teknologi Malaysia, 81310 Skudai, Johor, Malaysia.
  • Mohd Saberi Mohamad
    Health Data Science Lab, Department of Genetics and Genomics, College of Medicine and Health Sciences, United Arab Emirates University, Al Ain, United Arab Emirates.
  • Safaai Deris
    Faculty of Creative Technology & Heritage, Universiti Malaysia Kelantan, Locked Bag 01, Bachok, 16300 Kota Bharu, Kelantan, Malaysia.
  • Nazar Zaki
    Department of Computer Science and Software Engineering, College of Information Technology, United Arab Emirates University, Al Ain, UAE.
  • Shahreen Kasim
    Faculty of Computer Science and Information Technology, Universiti Tun Hussein Onn Malaysia, 86400 Batu Pahat, Malaysia.
  • Sigeru Omatu
    Department of Electronics, Information and Communication Engineering, Osaka Institute of Technology, Osaka 535-8585, Japan.
  • Juan Manuel Corchado
    Biomedical Research Institute of Salamanca/BISITE Research Group, University of Salamanca, Salamanca, Spain.
  • Hany Al Ashwal
    College of Information Technology, United Arab Emirates University, Al Ain 15551, United Arab Emirates.