A Deep Reinforcement Learning-Based Feature Selection Method for Invasive Disease Event Prediction Using Imbalanced Follow-Up Data.
Journal: IEEE Journal of Biomedical and Health Informatics
PMID: 40030195
Abstract
Machine learning-based models are a promising paradigm for predicting invasive disease events (iDEs) in breast cancer. Feature selection (FS) is an essential preprocessing technique used to identify the features pertinent to the prediction model. However, conventional FS methods often fail on imbalanced clinical data because they are biased towards the majority class. In this paper, a novel FS framework based on reinforcement learning (RLFS) is developed to identify the optimal feature subset for imbalanced data. RLFS employs an iterative methodology in which a data resampling technique generates a balanced dataset before each iteration. A decision network is then trained with a deep RL algorithm to identify the relevant features for the dataset in the current iteration. With this iterative training strategy, the numerous constructed datasets gradually boost the FS capacity of the decision network, resulting in robust performance on imbalanced data. Finally, a weighted model is proposed to determine the most suitable FS solution. RLFS is applied to predict breast cancer iDEs using real follow-up data. The comparison results demonstrate that RLFS effectively reduces the number of features while outperforming several state-of-the-art FS algorithms.
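
The iterative scheme described in the abstract (rebalance the data, then update a feature-selection policy on the balanced sample) can be illustrated with a small sketch. This is not the authors' implementation, and every concrete choice here is an assumption made for illustration. The abstract does not name the resampling technique, so random undersampling stands in for it. A per-feature Bernoulli policy updated with a REINFORCE-style rule stands in for the deep decision network. The reward (training accuracy of a nearest-centroid classifier minus a per-feature penalty) is invented. The function names `balanced_indices`, `centroid_accuracy`, and `select_features` are hypothetical, and the final weighted model is not covered.

```python
import math
import random


def balanced_indices(labels, rng):
    """Random undersampling (assumed): keep an equal number of rows per class."""
    by_class = {}
    for i, y in enumerate(labels):
        by_class.setdefault(y, []).append(i)
    n_min = min(len(ix) for ix in by_class.values())
    picked = []
    for ix in by_class.values():
        picked.extend(rng.sample(ix, n_min))
    return picked


def centroid_accuracy(X, y, rows, mask):
    """Training accuracy of a nearest-centroid classifier on the masked features."""
    feats = [j for j, keep in enumerate(mask) if keep]
    if not feats:
        return 0.0  # an empty feature subset earns no reward
    centroids = {}
    for c in set(y[i] for i in rows):
        members = [i for i in rows if y[i] == c]
        centroids[c] = [sum(X[i][j] for i in members) / len(members) for j in feats]
    correct = 0
    for i in rows:
        pred = min(
            centroids,
            key=lambda c: sum((X[i][j] - m) ** 2 for j, m in zip(feats, centroids[c])),
        )
        correct += pred == y[i]
    return correct / len(rows)


def select_features(X, y, n_iter=200, lr=0.5, penalty=0.02, seed=0):
    """Return per-feature selection probabilities after iterative training."""
    rng = random.Random(seed)
    n_feat = len(X[0])
    logits = [0.0] * n_feat   # policy parameters, one per feature
    baseline = 0.0            # running mean reward, used to reduce variance
    for _ in range(n_iter):
        rows = balanced_indices(y, rng)                      # new balanced dataset
        probs = [1.0 / (1.0 + math.exp(-z)) for z in logits]
        mask = [rng.random() < p for p in probs]             # sample a feature subset
        reward = centroid_accuracy(X, y, rows, mask) - penalty * sum(mask)
        advantage = reward - baseline
        baseline += 0.1 * (reward - baseline)
        # REINFORCE step: for a sigmoid-Bernoulli policy, d log pi / d logit = a - p
        for j in range(n_feat):
            logits[j] += lr * advantage * ((1.0 if mask[j] else 0.0) - probs[j])
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]
```

Calling `select_features(X, y)` on a list of feature rows `X` and class labels `y` returns one selection probability per feature; thresholding those probabilities (for example at 0.5) gives a candidate subset. Because a fresh balanced sample is drawn before every update, the minority class carries the same weight as the majority class in each reward signal, which is the property the abstract attributes to the iterative strategy.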