Precision Adaptive Imputation Network: An Unified Technique for Mixed Datasets
Journal: arXiv
Published Date: Jan 18, 2025
Abstract
The challenge of missing data remains a significant obstacle across various
scientific domains, necessitating the development of advanced imputation
techniques that can effectively address complex missingness patterns. This
study introduces the Precision Adaptive Imputation Network (PAIN), a novel
algorithm designed to enhance data reconstruction by dynamically adapting to
diverse data types, distributions, and missingness mechanisms. PAIN employs a
three-step process that integrates statistical methods, random forests, and
autoencoders to balance imputation accuracy and efficiency. Through
rigorous evaluation across multiple datasets, including those characterized by
high-dimensional and correlated features, PAIN consistently outperforms
traditional imputation methods, such as mean and median imputation, as well as
other advanced techniques like MissForest. The findings highlight PAIN's
superior ability to preserve data distributions and maintain analytical
integrity, particularly in complex scenarios where missingness is not
completely at random. This research not only contributes to a deeper
understanding of missing data reconstruction but also provides a critical
framework for future methodological innovations in data science and machine
learning, paving the way for more effective handling of mixed-type datasets in
real-world applications.
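
To make the comparison baselines concrete, the following is a minimal sketch of the mean and median imputation that the abstract names as traditional methods, extended with a mode fill so that it handles a mixed-type table. It is not the PAIN algorithm, whose three steps are not specified in this abstract. The function names (`impute_column`, `impute_table`), the toy columns, and the choice of mode for categorical values are assumptions made purely for illustration.

```python
from statistics import mean, median, mode


def impute_column(values, strategy="mean"):
    """Fill None entries in one column with a simple summary statistic.

    Numeric columns use the mean or median of the observed values.
    Non-numeric (categorical) columns fall back to the mode, which is
    an assumption of this sketch rather than something the abstract states.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("column has no observed values to impute from")
    is_numeric = all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in observed
    )
    if is_numeric:
        fill = mean(observed) if strategy == "mean" else median(observed)
    else:
        fill = mode(observed)
    return [fill if v is None else v for v in values]


def impute_table(table, strategy="mean"):
    """Apply impute_column independently to every column of a dict-of-lists table."""
    return {name: impute_column(col, strategy) for name, col in table.items()}


# Hypothetical mixed-type toy data; None marks a missing entry.
toy = {
    "age": [20, None, 30, 100],
    "city": ["Pune", "Delhi", None, "Pune"],
}
mean_filled = impute_table(toy, "mean")
median_filled = impute_table(toy, "median")
```

In this toy table the single large value in `age` pulls the mean fill well above the median fill, a small instance of how simple statistics can distort a column's distribution. Each column is also filled in isolation, so correlations between features are ignored, which is the kind of limitation that model-based methods such as MissForest aim to address.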