Characterizing individual and methodological risk factors for survey non-completion using machine learning: findings from the U.S. Millennium Cohort Study.

Journal: BMC Medical Research Methodology
Published Date:

Abstract

BACKGROUND: Missing survey data can threaten the validity and generalizability of findings from longitudinal cohort studies. Respondent characteristics and survey attributes may contribute to survey non-completion, a form of missing data in which respondents begin but do not finish a survey, and this missingness can lead to biased conclusions. The objectives of the present research are to demonstrate how machine learning can identify survey non-completion and to characterize the individual and methodological factors associated with this form of missing data.

Authors

  • Nate C Carnes
    Deployment Health Research Department, Naval Health Research Center, San Diego, CA, USA. nathan.c.carnes.mil@health.mil.
  • Claire A Kolaja
    Deployment Health Research Department, Naval Health Research Center, San Diego, CA, USA.
  • Crystal L Lewis
    Deployment Health Research Department, Naval Health Research Center, San Diego, CA, USA.
  • Sheila F Castañeda
    Deployment Health Research Department, Naval Health Research Center, San Diego, CA, USA.
  • Rudolph P Rull
    Deployment Health Research Department, Naval Health Research Center, San Diego, CA, USA.