Predicting Age Groups of Reddit Users Based on Posting Behavior and Metadata: Classification Model Development and Validation.

Journal: JMIR public health and surveillance
PMID:

Abstract

BACKGROUND: Social media are important for monitoring perceptions of public health issues and for educating target audiences about health; however, limited information about the demographics of social media users makes it challenging to identify conversations among target audiences and limits how well social media can be used for public health surveillance and education outreach efforts. Certain social media platforms provide demographic information on the followers of a user account when followers supply it, but this information is not always disclosed. Researchers have developed machine learning algorithms to predict social media users' demographic characteristics, mainly for Twitter. To date, there has been limited research on predicting the demographic characteristics of Reddit users.

Authors

  • Robert Chew
    Center for Data Science, RTI International, Research Triangle Park, NC, United States.
  • Caroline Kery
    Center for Data Science, RTI International, Research Triangle Park, NC, United States.
  • Laura Baum
    Center for Health Analytics, Media, and Policy, RTI International, Atlanta, GA, United States.
  • Thomas Bukowski
    Center for Health Analytics, Media, and Policy, RTI International, Berkeley, CA, United States.
  • Annice Kim
    Center for Health Analytics, Media, and Policy, RTI International, Research Triangle Park, NC, United States.
  • Mario Navarro
    Office of Health Communications and Education, Center for Tobacco Products, US Food and Drug Administration, Silver Spring, MD, United States.