Hybrid feature selection framework for enhanced credit card fraud detection using machine learning models.
Journal:
PloS one
Published Date:
Jul 16, 2025
Abstract
Electronic payment methods are increasingly prevalent worldwide, facilitating both in-person and online transactions. As credit card usage for online payments grows, fraud and payment defaults have also risen, resulting in significant financial losses. Detecting fraudulent transactions is challenging due to the highly imbalanced nature of transaction datasets, where fraudulent activities constitute only a small fraction of the data. To address this, we propose a novel hybrid feature selection framework designed to enhance the performance of machine learning models in credit card fraud detection. Our framework integrates three complementary feature selection techniques: Pearson correlation, information gain (IG), and random forest importance (RFI), each optimized for the dataset's characteristics. Pearson correlation eliminates redundancy by removing highly correlated features, while IG and RFI evaluate the relevance of the remaining features. A union operation combines the most informative features from these methods, ensuring comprehensive and efficient feature selection. To validate the proposed approach, we test it on five diverse datasets with varying characteristics and imbalance levels, employing five state-of-the-art machine learning algorithms: Random Forest (RF), Extra Trees (ET), XGBoost (XGBC), AdaBoost, and CatBoost. Although the framework is designed primarily for PCA-transformed datasets, we also apply it to a real-world dataset for validation. The results demonstrate that our methodology outperforms existing baseline approaches, achieving superior fraud detection performance across all datasets. Our findings highlight the robustness and adaptability of the proposed framework, offering a practical solution for real-world fraud detection systems.
Additionally, we believe that our proposed framework can serve as a decision support system for the real-time detection of fraudulent credit card transactions, with the potential to make a substantial contribution to the business industry.
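
The three-stage selection described in the abstract (a Pearson correlation filter, followed by IG and RFI ranking of the remaining features, combined by a union) can be illustrated with a minimal Python sketch. This is not the authors' implementation. The function names `drop_correlated` and `hybrid_select`, the correlation threshold of 0.9, the `top_k` cutoff, and the use of scikit-learn's `mutual_info_classif` as a stand-in estimator for information gain are all assumptions made for illustration; the abstract does not state these details.

```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import mutual_info_classif


def drop_correlated(X: pd.DataFrame, threshold: float = 0.9) -> list[str]:
    """Return the columns kept after removing one feature of each highly correlated pair.

    The threshold of 0.9 is an assumed value, not one taken from the paper.
    """
    corr = X.corr(method="pearson").abs()
    # Keep only the strict upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    # A column is dropped if it correlates strongly with any earlier column.
    to_drop = {col for col in upper.columns if (upper[col] > threshold).any()}
    return [col for col in X.columns if col not in to_drop]


def hybrid_select(X: pd.DataFrame, y, corr_threshold: float = 0.9,
                  top_k: int = 10, seed: int = 0) -> list[str]:
    """Union of the top-k features by mutual information and by RF importance."""
    kept = drop_correlated(X, corr_threshold)
    X_kept = X[kept]

    # Assumption: mutual information is used here as the information gain score.
    ig = pd.Series(mutual_info_classif(X_kept, y, random_state=seed), index=kept)

    # Assumption: balanced class weights, chosen because fraud data is imbalanced.
    forest = RandomForestClassifier(n_estimators=100, class_weight="balanced",
                                    random_state=seed)
    forest.fit(X_kept, y)
    rfi = pd.Series(forest.feature_importances_, index=kept)

    selected = set(ig.nlargest(top_k).index) | set(rfi.nlargest(top_k).index)
    # Preserve the original column order in the result.
    return [col for col in kept if col in selected]


if __name__ == "__main__":
    # Synthetic, imbalanced stand-in data (roughly 5% positives); not a fraud dataset.
    data, labels = make_classification(n_samples=500, n_features=8, n_informative=4,
                                       n_redundant=0, weights=[0.95, 0.05],
                                       random_state=0)
    frame = pd.DataFrame(data, columns=[f"f{i}" for i in range(8)])
    print(hybrid_select(frame, labels, top_k=3))
```

Because the two rankings are combined by a union rather than an intersection, the result contains between `top_k` and `2 * top_k` features (fewer only if the correlation filter leaves fewer than `top_k` columns), so a feature favored by only one criterion is still retained.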