DREAMER: a computational framework to evaluate readiness of datasets for machine learning.
Journal:
BMC medical informatics and decision making
Published Date:
Jun 4, 2024
Abstract
BACKGROUND: Machine learning (ML) has emerged as the predominant computational paradigm for analyzing large-scale datasets across diverse domains. The assessment of dataset quality stands as a pivotal precursor to the successful deployment of ML models. In this study, we introduce DREAMER (Data REAdiness for MachinE learning Research), an algorithmic framework leveraging supervised and unsupervised machine learning techniques to autonomously evaluate the suitability of tabular datasets for ML model development. DREAMER is openly accessible as a tool on GitHub and Docker, facilitating its adoption and further refinement within the research community..