The Impact of Social Determinants on Cardiovascular Mortality: A Zip Code-Level Analysis in Indiana.
Journal:
Studies in health technology and informatics
Published Date:
Aug 7, 2025
Abstract
Cardiovascular disease (CVD) is the leading cause of death globally and is projected to remain so through 2030. Although the role of social determinants of health (SDoH) in CVD outcomes is well established, their integration into predictive health informatics models remains limited. This study presents a scalable geospatial framework for predicting CVD mortality at the zip-code level in Indiana, using data from 2015 to 2022. We integrated diverse data sources, including the U.S. Census Bureau, OpenStreetMap (via the Overpass API), Zillow, and the Indiana Department of Health, covering 773 zip codes. Data preprocessing involved spatial normalization, robust scaling, and population-adjusted feature engineering to align socioeconomic, environmental, and healthcare infrastructure indicators. We tested several machine learning models, including Random Forest, XGBoost, and FastAI Tabular; XGBoost delivered the best results (R² = 0.9863; MAE = 2.29). To improve model interpretability, we used SHAP analysis, which identified education, race, income, and healthcare access as the most significant SDoH predictors of CVD mortality. This study contributes to health informatics by presenting a replicable pipeline for localized SDoH-based mortality prediction. It highlights the importance of integrating geospatial and SDoH data in predictive models and the potential of informatics to address health disparities and guide targeted public health interventions at the community level.
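The preprocessing and evaluation terms used in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the abstract does not give the exact formulas, so the sketch assumes that "population-adjusted" means a rate per 10,000 residents and that "robust scaling" means centering on the median and dividing by the interquartile range. All function names are hypothetical, and the numbers are made up for illustration only.

```python
import statistics


def rate_per_10k(count, population):
    """Population-adjusted feature: raw count per 10,000 residents (assumed form)."""
    if population <= 0:
        raise ValueError("population must be positive")
    return count / population * 10_000


def robust_scale(values):
    """Center on the median and divide by the interquartile range (assumed form)."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero; cannot scale")
    median = statistics.median(values)
    return [(v - median) / iqr for v in values]


def mean_absolute_error(y_true, y_pred):
    """MAE: average absolute difference between observed and predicted values."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r_squared(y_true, y_pred):
    """R^2: 1 minus the ratio of residual to total sum of squares."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


if __name__ == "__main__":
    # Illustrative values only, not data from the study.
    clinics = [3, 1, 8, 2, 5]
    populations = [12_000, 4_500, 30_000, 9_800, 21_000]
    rates = [rate_per_10k(c, p) for c, p in zip(clinics, populations)]
    print(robust_scale(rates))
```

Scaling by median and interquartile range rather than by mean and standard deviation keeps a few very large or very small zip codes from dominating the feature range, which is one plausible reason to prefer it for zip-code-level data.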