Data Leakage and Deceptive Performance: A Critical Examination of Credit Card Fraud Detection Methodologies
Journal:
arXiv
Published Date:
Jun 3, 2025
Abstract
This study critically examines the methodological rigor in credit card fraud
detection research, revealing how fundamental evaluation flaws can overshadow
algorithmic sophistication. Through deliberate experimentation with improper
evaluation protocols, we demonstrate that even simple models can achieve
deceptively impressive results when basic methodological principles are
violated. Our analysis identifies four critical issues plaguing current
approaches: (1) pervasive data leakage from improper preprocessing sequences,
(2) intentional vagueness in methodological reporting, (3) inadequate temporal
validation for transaction data, and (4) metric manipulation through recall
optimization at precision's expense. We present a case study showing how a
minimal neural network architecture with data leakage outperforms many
sophisticated methods reported in literature, achieving 99.9\% recall despite
fundamental evaluation flaws. These findings underscore that proper evaluation
methodology matters more than model complexity in fraud detection research. The
study serves as a cautionary example of how methodological rigor must precede
architectural sophistication, with implications for improving research
practices across machine learning applications.