Exploring a Hybrid Deep Learning Approach for Anomaly Detection in Mental Healthcare Provider Billing: Addressing Label Scarcity through Semi-Supervised Anomaly Detection
Journal: arXiv
Published Date: Jul 2, 2025
Abstract
The complexity of mental healthcare billing creates room for anomalies, including
fraud. While machine learning methods have been applied to anomaly detection,
they often struggle with class imbalance, label scarcity, and complex
sequential patterns. This study explores a hybrid deep learning approach
combining Long Short-Term Memory (LSTM) networks and Transformers, with
pseudo-labeling via Isolation Forests (iForest) and Autoencoders (AE). Prior
work has not evaluated such hybrid models trained on pseudo-labeled data in the
context of healthcare billing. The approach is evaluated on two real-world
billing datasets related to mental healthcare. The iForest LSTM baseline
achieves the highest recall (0.963) on declaration-level data. On the
operation-level data, the hybrid iForest-based model achieves the highest
recall (0.744), though at the cost of lower precision. These findings highlight
the potential of combining pseudo-labeling with hybrid deep learning in
complex, imbalanced anomaly detection settings.
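
To make the pseudo-labeling idea and the recall/precision trade-off concrete, the following is a minimal sketch and not the authors' implementation. It assumes that an unsupervised detector (such as an iForest or an AE) has already produced one anomaly score per billing record, with higher scores meaning more anomalous. The function names `pseudo_label` and `precision_recall`, the `contamination` parameter, and all score and prediction values are hypothetical and chosen only for illustration.

```python
from typing import List, Tuple


def pseudo_label(scores: List[float], contamination: float) -> List[int]:
    """Label the top `contamination` fraction of records as anomalous (1).

    Assumption: higher score = more anomalous. The scores themselves would
    come from an unsupervised detector, which is not modeled here.
    """
    if not 0.0 < contamination < 1.0:
        raise ValueError("contamination must be in (0, 1)")
    n_anomalies = max(1, round(len(scores) * contamination))
    # Rank record indices from most to least anomalous.
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    flagged = set(ranked[:n_anomalies])
    return [1 if i in flagged else 0 for i in range(len(scores))]


def precision_recall(y_true: List[int], y_pred: List[int]) -> Tuple[float, float]:
    """Compute precision and recall for the anomalous class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Made-up anomaly scores for ten billing records (illustrative only).
scores = [0.10, 0.92, 0.15, 0.40, 0.88, 0.05, 0.30, 0.20, 0.12, 0.25]
labels = pseudo_label(scores, contamination=0.2)

# Made-up predictions from a hypothetical downstream classifier that
# flags aggressively: it catches both pseudo-anomalies but also raises
# two false alarms.
preds = [0, 1, 0, 1, 1, 0, 1, 0, 0, 0]
precision, recall = precision_recall(labels, preds)

print(labels)             # [0, 1, 0, 0, 1, 0, 0, 0, 0, 0]
print(precision, recall)  # 0.5 1.0
```

In this toy case the classifier recovers every pseudo-labeled anomaly (recall 1.0) while half of its flags are false alarms (precision 0.5), which is the kind of trade-off the abstract reports for the operation-level data. Note that metrics computed against pseudo-labels measure agreement with the unsupervised detector, not with verified ground truth.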