Active learning using rough fuzzy classifier for cancer prediction from microarray gene expression data.

Journal: Journal of biomedical informatics
Published Date:

Abstract

Cancer classification from microarray gene expression data is an important area of research in computational biology and bioinformatics. Traditional supervised techniques often fail to achieve the desired accuracy because the number of clinically labeled patterns is very small. In such situations, active learning can play an important role: it computationally selects only a few of the most informative (confusing) samples to be labeled by experts, and these are added to the training set, which in turn can improve prediction accuracy. In this work, a novel active learning method using a rough-fuzzy classifier (ALRFC) is proposed for cancer sample classification using gene expression data. The proposed technique can handle the uncertainty, overlap, and indiscernibility usually present in the subtype classes of gene expression data. The proposed algorithm is tested on different publicly available benchmark cancer datasets, and its performance is compared with three other active learning techniques, one semi-supervised classification algorithm, and two (non-active) supervised learning techniques in terms of prediction accuracy, precision, recall, F-measure, and kappa. The experimental results establish the superiority of the proposed method for cancer prediction over the other state-of-the-art techniques. Paired t-test results also confirm the statistical significance of the better results achieved by the proposed method (in comparison with the other methods) for most of the datasets.
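
The general idea of querying the most confusing samples can be illustrated with a minimal sketch. This is not the ALRFC algorithm: the abstract does not specify the rough-fuzzy classifier, so the sketch swaps in a simple fuzzy nearest-centroid classifier and assumes that a sample is "confusing" when its two largest class memberships are nearly equal. All function names (`centroids`, `memberships`, `margin`, `active_learn`) and the `oracle` callback, which stands in for the human expert, are hypothetical.

```python
def centroids(X, y):
    """Mean feature vector of each class in the labeled set."""
    groups = {}
    for x, label in zip(X, y):
        groups.setdefault(label, []).append(x)
    return {c: [sum(col) / len(rows) for col in zip(*rows)]
            for c, rows in groups.items()}


def memberships(x, cents, eps=1e-12):
    """Fuzzy membership of x in each class: normalised inverse squared distance."""
    inv = {c: 1.0 / (sum((a - b) ** 2 for a, b in zip(x, m)) + eps)
           for c, m in cents.items()}
    total = sum(inv.values())
    return {c: v / total for c, v in inv.items()}


def margin(x, cents):
    """Gap between the two largest memberships (assumed confusion measure).

    A small gap means the sample lies between classes. Requires at least
    two classes among the labeled samples.
    """
    top = sorted(memberships(x, cents).values(), reverse=True)
    return top[0] - top[1]


def predict(x, cents):
    """Assign x to the class with the highest membership."""
    u = memberships(x, cents)
    return max(u, key=u.get)


def active_learn(X_lab, y_lab, X_pool, oracle, budget):
    """Repeatedly query the oracle for the most confusing pool sample."""
    X_lab, y_lab, pool = list(X_lab), list(y_lab), list(X_pool)
    for _ in range(min(budget, len(pool))):
        cents = centroids(X_lab, y_lab)
        # Pick the unlabeled sample with the smallest membership margin.
        idx = min(range(len(pool)), key=lambda i: margin(pool[i], cents))
        x = pool.pop(idx)
        X_lab.append(x)
        y_lab.append(oracle(x))  # the expert supplies the label
    return centroids(X_lab, y_lab), X_lab, y_lab
```

With two labeled samples `[0, 0]` (class 0) and `[2, 2]` (class 1) and a pool containing `[1.0, 1.1]`, `[0.0, 0.1]`, and `[2.0, 2.1]`, a budget of one query selects `[1.0, 1.1]`, since it lies almost midway between the two class centroids. Real gene expression data has thousands of features per sample, so a practical implementation would likely add feature selection and a vectorised distance computation.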

Authors

  • Anindya Halder
    Dept. of Computer Applications, North-Eastern Hill University, Tura Campus, Meghalaya 794002, India.
  • Ansuman Kumar
    Dept. of Computer Applications, North-Eastern Hill University, Tura Campus, Meghalaya 794002, India. Electronic address: ansuman.kumar@gmail.com.