Unsupervised feature selection with evolutionary sparsity.

Journal: Neural networks : the official journal of the International Neural Network Society
Published Date:

Abstract

The ℓ₂,₀-norm is playing an increasingly important role in unsupervised feature selection. However, existing algorithms for optimization problems with an ℓ₂,₀-norm constraint have two problems. First, they cannot automatically determine the sparsity, that is, the number of key features. Second, they risk converging to local optima and therefore selecting trivial (less informative) features. To address these problems, this paper proposes an unsupervised feature selection method with evolutionary sparsity (EVSP), which integrates feature selection via a sparse projection matrix with a population search mechanism in a unified unsupervised feature selection framework. Specifically, the level of sparsity is encoded in the population individuals, and a multi-objective evolutionary algorithm based on binary encoding is introduced to recursively determine the optimal level of sparsity, thereby guiding, without supervision, the learning of an optimal row-sparse projection matrix. Moreover, using the feature weights learned through sparse projection, a two-stage strategy called the mutation-repair operator is designed to steer the evolution of the population toward high-quality candidate solutions. Comprehensive experiments on eleven benchmark datasets, with a maximum dimensionality of 10304 features and a maximum size of 9298 samples, demonstrate that the proposed EVSP method can effectively determine the optimal sparsity level and significantly outperforms several state-of-the-art methods.
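The abstract does not spell out the mutation-repair operator, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes that each individual is a binary mask over features, that feature weights are the ℓ₂ norms of the rows of the projection matrix, and that repair restores a target number of selected features k. The function names, the `flip_prob` parameter, and the fixed target k are all hypothetical choices made for illustration.

```python
import numpy as np


def row_norm_weights(W):
    """Feature weights taken as the l2 norm of each row of W (features x dims)."""
    return np.linalg.norm(W, axis=1)


def count_nonzero_rows(W, tol=1e-12):
    """Row sparsity of W: the number of rows whose l2 norm exceeds tol."""
    return int(np.sum(row_norm_weights(W) > tol))


def mutate_repair(mask, weights, k, rng, flip_prob=0.1):
    """Hypothetical two-stage operator on a boolean feature mask.

    Stage 1 (mutation): flip each bit independently with probability flip_prob.
    Stage 2 (repair): restore exactly k selected features, dropping the
    lowest-weight selected features or adding the highest-weight unselected ones.
    """
    child = mask.copy()
    flips = rng.random(child.size) < flip_prob
    child[flips] = ~child[flips]

    selected = np.flatnonzero(child)
    if selected.size > k:
        # Too many features: drop the selected ones with the smallest weights.
        order = np.argsort(weights[selected])
        child[selected[order[: selected.size - k]]] = False
    elif selected.size < k:
        # Too few features: add the unselected ones with the largest weights.
        unselected = np.flatnonzero(~child)
        order = np.argsort(weights[unselected])[::-1]
        child[unselected[order[: k - selected.size]]] = True
    return child
```

Under these assumptions, a call such as `mutate_repair(mask, row_norm_weights(W), k=20, rng=np.random.default_rng(0))` returns a child mask with exactly 20 selected features. In the method described above the sparsity level is itself searched by the evolutionary algorithm, so a fixed k here stands in for whatever level a given individual encodes.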

Authors

  • Shixuan Zhou
    School of Software Engineering, South China University of Technology, Guangzhou 510006, China. Electronic address: 202410190257@mail.scut.edu.cn.
  • Yi Xiang
    Department of Ophthalmology, The Central Hospital of Wuhan, Tongji Medical College, Huazhong University of Science and Technology, Wuhan, Hubei, China.
  • Han Huang
    School of Software Engineering, South China University of Technology, Guangzhou 510006, China. Electronic address: hhan@scut.edu.cn.
  • Pei Huang
    MRC Cognition and Brain Sciences Unit, University of Cambridge, Cambridge, United Kingdom.
  • Chaoda Peng
    School of Mathematics and Informatics, South China Agricultural University, Guangzhou, 510006, China. Electronic address: chaodapeng@scau.edu.cn.
  • Xiaowei Yang
    Key Laboratory of Flexible Electronics (KLOFE) & Institute of Advanced Materials (IAM), Nanjing Tech University (Nanjing Tech), Nanjing 211816, P. R. China.
  • Peng Song