Contribution of Structure Learning Algorithms in Social Epidemiology: Application to Real-World Data.

Journal: International Journal of Environmental Research and Public Health

PMID: 40238329

Abstract

Epidemiologists often handle large datasets with numerous variables and are currently seeing a growing wealth of techniques for data analysis, such as machine learning. Critical aspects involve addressing causality, often based on observational data, and dealing with the complex relationships between variables to uncover the overall structure of variable interactions, causal or not. Structure learning (SL) methods aim to automatically or semi-automatically reveal the structure of variables' relationships. The objective of this study is to delineate some of the potential contributions and limitations of structure learning methods when applied to social epidemiology topics and the search for determinants of healthcare system access. We applied SL techniques to a real-world dataset, namely the 2010 wave of the SIRS cohort, which included a sample of 3006 adults from the Paris region, France. Healthcare utilization, encompassing both direct and indirect access to care, was the primary outcome. Candidate determinants included health status, demographic characteristics, and socio-cultural and economic positions. We present two approaches: a non-automated epidemiological method (an initial expert knowledge network and stepwise logistic regression models) and three SL techniques using various algorithms, with and without knowledge constraints. We compared the results based on the presence, direction, and strength of specific links within the produced network. Although the interdependencies and relative strengths identified by both approaches were similar, the SL algorithms detected fewer associations with the outcome than the non-automated method. Relationships between variables were sometimes incorrectly oriented when using a purely data-driven approach. SL algorithms can be valuable in exploratory stages, helping to generate new hypotheses or mine novel databases. However, results should be validated against prior knowledge and supplemented with additional confirmatory analyses.

Authors

Helene Colineaux
EQUITY Team, Centre d'Epidémiologie et de Recherche en Santé des POPulations (CERPOP), Institut National de la Santé et de la Recherche Médicale (INSERM)-Toulouse III University, 37 Allées Jules Guesde, 31062 Toulouse, France.

Benoit Lepage
Medical Information Department, University Hospital of Toulouse, 31059 Toulouse, France. Electronic address: lepage.b@chu-toulouse.fr.

Pierre Chauvin
UMRS 1136, Pierre Louis Institute of Epidemiology and Public Health, Department of Social Epidemiology, Institut National de la Santé et de la Recherche Médicale (INSERM), Sorbonne University, 75005 Paris, France.

Chloe Dimeglio
Toulouse Institute for Infectious and Inflammatory Diseases (INFINITY), Institut National de la Santé et de la Recherche Médicale (INSERM), UMR 1291, Centre National de la Recherche Scientifique (CNRS), UMR 5051, 31300 Toulouse, France.

Cyrille Delpierre
EQUITY Team, Centre d'Epidémiologie et de Recherche en Santé des POPulations (CERPOP), Institut National de la Santé et de la Recherche Médicale (INSERM)-Toulouse III University, 37 Allées Jules Guesde, 31062 Toulouse, France.

Thomas Lefèvre
Hôpital Jean-Verdier (AP-HP), Department of Forensic Science and Medicine, F-93140 Bondy, France; IRIS - Institut de recherches interdisciplinaires sur les enjeux sociaux (UMR 8156-723), Bobigny, France. Electronic address: thomas.lefevre@univ-paris13.fr.

Keywords

Adult; Aged; Algorithms; Cohort Studies; Female; France; Humans; Machine Learning; Male; Middle Aged; Paris

