Machine Learning and Statistical Insights into Hospital Stay Durations: The Italian EHR Case
Journal: arXiv
Published Date: Apr 25, 2025
Abstract
Length of hospital stay (LoS) is a critical metric for assessing healthcare quality
and optimizing hospital resource management. This study aims to identify
factors influencing LoS within the Italian healthcare context, using a dataset
of hospitalization records from over 60 healthcare facilities in the Piedmont
region, spanning from 2020 to 2023. We explored a variety of features,
including patient characteristics, comorbidities, admission details, and
hospital-specific factors. Significant correlations were found between LoS and
features such as age group, comorbidity score, admission type, and the month of
admission. Machine learning models, specifically CatBoost and Random Forest,
were used to predict LoS. The highest R² score, 0.49, was achieved with
CatBoost, meaning the best model explains roughly half of the variance in LoS.
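
To help interpret the reported score, the following is a minimal sketch of how the coefficient of determination is usually computed, assuming the standard definition R² = 1 - SS_res / SS_tot. The abstract does not state how the score was calculated, so the function name `r2_score` and the sample lengths of stay below are illustrative assumptions, not data or code from the study.

```python
from statistics import fmean


def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot (standard definition)."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_true = fmean(y_true)
    # Residual sum of squares: error left over after the model's predictions.
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    # Total sum of squares: error of always predicting the observed mean.
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    if ss_tot == 0:
        raise ValueError("R^2 is undefined when all observed values are identical")
    return 1.0 - ss_res / ss_tot


# Hypothetical lengths of stay in days (illustrative values, not from the study).
observed = [2, 3, 5, 8, 12]
predicted = [3, 3, 6, 7, 9]
score = r2_score(observed, predicted)
print(f"R^2 = {score:.3f}")
```

Under this definition, a score of 0 corresponds to always predicting the mean LoS and a score of 1 to perfect prediction, so a value of 0.49 sits about halfway between the two.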