Leveraging MIMIC Datasets for Better Digital Health: A Review on Open Problems, Progress Highlights, and Future Promises
Journal:
arXiv
Published Date:
Jun 15, 2025
Abstract
The Medical Information Mart for Intensive Care (MIMIC) datasets have become
the core of digital health research by providing freely accessible,
deidentified records from tens of thousands of critical care admissions,
enabling a broad spectrum of applications in clinical decision support, outcome
prediction, and healthcare analytics. Although numerous studies and surveys
have explored the predictive power and clinical utility of MIMIC-based models,
critical challenges in data integration, representation, and interoperability
remain underexplored. This paper presents a comprehensive survey that focuses
specifically on these open problems. We identify persistent issues such as data
granularity, cardinality limitations, heterogeneous coding schemes, and ethical
constraints that hinder the generalizability and real-time implementation of
machine learning models. We highlight key progress in dimensionality reduction,
temporal modelling, causal inference, and privacy-preserving analytics, while
also outlining promising directions including hybrid modelling, federated
learning, and standardized preprocessing pipelines. By critically examining
these structural limitations and their implications, this survey offers
actionable insights to guide the next generation of MIMIC-powered digital
health innovations.