Comparative Study of Classification Algorithms for Various DNA Microarray Data.

Journal: Genes
Published Date:

Abstract

Microarrays apply electrical engineering and technology to biology, allowing the expression of numerous genes to be measured simultaneously, and they can be used to analyze specific diseases. This study undertakes classification analyses of various microarray datasets to compare the performance of classification algorithms across different data traits. The datasets were classified into test and control groups using five machine learning methods: MultiLayer Perceptron (MLP), Support Vector Machine (SVM), Decision Tree (DT), Random Forest (RF), and k-Nearest Neighbors (KNN). Performance was evaluated with k-fold cross-validation, and the resulting accuracies of the five methods were compared. The experiments showed that the two tree-based methods, DT and RF, followed similar trends, and that the remaining three methods, MLP, SVM, and KNN, also followed similar trends. DT and RF generally performed worse than the other methods, except on one dataset. This suggests that, for the effective classification of microarray data, selecting a classification algorithm suited to the data traits is crucial to ensure optimum performance.
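
The evaluation procedure described above (score each classifier by k-fold cross-validation, then compare mean accuracies) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the data are synthetic stand-ins for expression profiles, the number of folds (5) and the number of neighbors (3) are assumed because the source does not state them, and all function names are hypothetical. Only a hand-written KNN and a majority-class baseline are shown; the baseline is a stand-in comparison point, not one of the five methods in the study.

```python
import random
from collections import Counter


def k_fold_indices(n_samples, n_folds, seed=0):
    """Shuffle sample indices and split them into n_folds nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::n_folds] for i in range(n_folds)]


def knn_predict(train_x, train_y, query, n_neighbors=3):
    """Majority vote among the n_neighbors nearest training samples
    (squared Euclidean distance). n_neighbors=3 is an assumed value."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(x, query)), label)
        for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in dists[:n_neighbors])
    return votes.most_common(1)[0][0]


def majority_predict(train_x, train_y, query):
    """Baseline (not from the study): always predict the most common training label."""
    return Counter(train_y).most_common(1)[0][0]


def cross_validate(x, y, predict, n_folds=5, seed=0):
    """Return the mean accuracy of `predict` over n_folds held-out folds."""
    accuracies = []
    for held_out in k_fold_indices(len(x), n_folds, seed):
        held = set(held_out)
        train_idx = [i for i in range(len(x)) if i not in held]
        train_x = [x[i] for i in train_idx]
        train_y = [y[i] for i in train_idx]
        correct = sum(predict(train_x, train_y, x[i]) == y[i] for i in held_out)
        accuracies.append(correct / len(held_out))
    return sum(accuracies) / n_folds


def make_toy_expression_data(n_per_class=20, n_genes=10, seed=1):
    """Synthetic 'control' (label 0) and 'test' (label 1) samples.
    Purely illustrative; not derived from any dataset in the study."""
    rng = random.Random(seed)
    x, y = [], []
    for label, shift in ((0, 0.0), (1, 3.0)):
        for _ in range(n_per_class):
            x.append([rng.gauss(shift, 1.0) for _ in range(n_genes)])
            y.append(label)
    return x, y


if __name__ == "__main__":
    x, y = make_toy_expression_data()
    for name, method in (("KNN", knn_predict), ("majority baseline", majority_predict)):
        print(name, cross_validate(x, y, method, n_folds=5))
```

Every method is scored on the same folds because the same seed is passed to `k_fold_indices`, so differences in mean accuracy reflect the classifiers rather than the split.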

Authors

  • Jingeun Kim
    Department of IT Convergence Engineering, Gachon University, Seongnam-daero 1342, Seongnam-si 13120, Korea.
  • Yourim Yoon
    Department of Computer Engineering, College of Information Technology, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si, Gyeonggi-do 461-701, Republic of Korea.
  • Hye-Jin Park
    Department of Food Science and Biotechnology, College of BioNano Technology, Gachon University, Seongnam-daero 1342, Sujeong-gu, Seongnam-si 13120, Korea.
  • Yong-Hyuk Kim
    Department of Computer Science & Engineering, Kwangwoon University, 20 Kwangwoon-ro, Nowon-gu, Seoul 139-701, Republic of Korea.