Budget constrained non-monotonic feature selection.

Journal: Neural networks : the official journal of the International Neural Network Society
PMID:

Abstract

Feature selection is an important problem in machine learning and data mining. We consider the problem of selecting features under a budget constraint on the size of the feature subset. Traditional feature selection methods suffer from the "monotonic" property: if a feature is selected for a given number of features, it will always be selected when the specified number of features is larger. This property sacrifices the effectiveness that non-monotonic feature selection methods can offer. Hence, in this paper, we develop an algorithm for non-monotonic feature selection that approximates the related combinatorial optimization problem by a Multiple Kernel Learning (MKL) problem. We establish a performance guarantee for the derived solution relative to the globally optimal solution of the related combinatorial optimization problem. Finally, we conduct a series of empirical evaluations on both synthetic and real-world benchmark datasets, for classification and regression tasks, to demonstrate the promising performance of the proposed framework compared with baseline feature selection approaches.
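
The "monotonic" property described in the abstract can be illustrated with a small sketch. This is not the MKL-based algorithm of the paper. It only contrasts a greedy forward selector, whose subsets are nested by construction, with an exhaustive best-subset search under the same budget. The toy data, the least-squares criterion, and the function names (`rss`, `best_subset`, `greedy_forward`) are assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def rss(columns, y, subset):
    """Residual sum of squares of a no-intercept least-squares fit of y
    on the chosen columns, computed by Gram-Schmidt projection."""
    basis, resid = [], list(y)
    for j in subset:
        v = list(columns[j])
        for b in basis:  # remove components along earlier directions
            c = dot(v, b)
            v = [vi - c * bi for vi, bi in zip(v, b)]
        norm = sqrt(dot(v, v))
        if norm < 1e-12:  # column adds no new direction
            continue
        b = [vi / norm for vi in v]
        basis.append(b)
        c = dot(resid, b)
        resid = [ri - c * bi for ri, bi in zip(resid, b)]
    return dot(resid, resid)


def best_subset(columns, y, m):
    """Exhaustive search over all subsets of size m (the budget)."""
    return min(combinations(range(len(columns)), m),
               key=lambda s: rss(columns, y, s))


def greedy_forward(columns, y, m):
    """Add one feature at a time; subsets for growing m are nested."""
    chosen = []
    for _ in range(m):
        rest = [j for j in range(len(columns)) if j not in chosen]
        chosen.append(min(rest, key=lambda j: rss(columns, y, chosen + [j])))
    return tuple(sorted(chosen))


# Hypothetical toy data: y is exactly f1 + f2, while f0 is a slightly
# perturbed copy of y, so f0 is the best single feature on its own.
f1 = [1.0, 1.0, -1.0, -1.0]
f2 = [1.0, -1.0, 1.0, -1.0]
y = [a + b for a, b in zip(f1, f2)]
f0 = [t + 0.1 * a * b for t, a, b in zip(y, f1, f2)]
columns = [f0, f1, f2]

for m in (1, 2):
    print(m, "greedy:", greedy_forward(columns, y, m),
          "exhaustive:", best_subset(columns, y, m))
```

On this toy data the exhaustive search picks feature 0 under a budget of one but features 1 and 2 under a budget of two, so the optimal subsets are not nested. The greedy selector keeps feature 0 at both budgets and ends with a worse fit for the same budget. Exhaustive search is only feasible for very small problems, which is why an approximation of the combinatorial problem is needed in practice.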

Authors

  • Haiqin Yang
    Shenzhen Key Laboratory of Rich Media Big Data Analytics and Application, Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong; Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong. Electronic address: hqyang@ieee.org.
  • Zenglin Xu
    Big Data Research Center, University of Electronic Science & Technology, Chengdu, Sichuan, China; School of Computer Science and Engineering, University of Electronic Science & Technology, Chengdu, Sichuan, China. Electronic address: zlxu@uestc.edu.cn.
  • Michael R Lyu
    Shenzhen Key Laboratory of Rich Media Big Data Analytics and Application, Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong; Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong.
  • Irwin King
    Shenzhen Key Laboratory of Rich Media Big Data Analytics and Application, Shenzhen Research Institute, The Chinese University of Hong Kong, Hong Kong; Computer Science & Engineering, The Chinese University of Hong Kong, Hong Kong.