A machine learning approach to predict ethnicity using personal name and census location in Canada.
Journal: PloS one
Published Date: Jan 1, 2020
Abstract
BACKGROUND: Canada is an ethnically diverse country, yet the lack of ethnicity information in many large databases impedes effective population research and interventions. Automated ethnicity classification using machine learning has shown potential to address this data gap, but its performance in Canada is largely unknown. This study developed a large-scale machine learning framework to predict ethnicity using a novel set of name and census location features.