Fairness in Classifying and Grouping Health Equity Information.

Journal: Studies in health technology and informatics
Published Date:

Abstract

This paper explores the balance between fairness and performance in machine learning classifiers that predict the likelihood of a patient receiving antimicrobial treatment, using structured data from community nursing wound care electronic health records. The data include two important predictors related to the social determinants of health (gender and language), which we used to evaluate the fairness of the classifiers. We also analyze how different groupings of language codes affect the classifiers' performance and fairness. The most common statistical learning-based classifiers are evaluated. The findings indicate that K-Nearest Neighbors offers the best fairness metrics across the grouping settings, while the performance of all classifiers is generally consistent across language code groupings. In addition, grouping more variables tends to improve the fairness metrics of all classifiers while maintaining their performance.
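
The kind of comparison described in the abstract can be illustrated with a minimal sketch. The abstract does not say which fairness metrics were used, so the example below assumes demographic parity difference (the gap between the highest and lowest positive-prediction rate across groups) as a stand-in. The language codes, the labels, the predictions, and the `group_codes` mapping are all hypothetical toy values, not data or results from the paper.

```python
from collections import defaultdict


def group_codes(codes, mapping, default="Non-English"):
    """Map raw language codes to coarser groups (hypothetical mapping)."""
    return [mapping.get(code, default) for code in codes]


def positive_rates(y_pred, groups):
    """Share of positive predictions within each group."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for pred, group in zip(y_pred, groups):
        totals[group] += 1
        positives[group] += int(pred == 1)
    return {group: positives[group] / totals[group] for group in totals}


def demographic_parity_difference(y_pred, groups):
    """Assumed fairness metric: max minus min positive rate across groups."""
    rates = positive_rates(y_pred, groups)
    return max(rates.values()) - min(rates.values())


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Toy inputs (assumptions for illustration only).
languages = ["en", "en", "fr", "fr", "pa", "zh", "zh", "pa"]
y_true = [1, 0, 1, 0, 1, 0, 1, 0]
y_pred = [1, 0, 1, 1, 0, 0, 1, 0]

# Fine-grained setting: each language code is its own group.
fine_gap = demographic_parity_difference(y_pred, languages)

# Coarse setting: collapse every code other than "en" into one group.
coarse_groups = group_codes(languages, {"en": "English"})
coarse_gap = demographic_parity_difference(y_pred, coarse_groups)

# Accuracy depends only on labels and predictions, not on the grouping.
acc = accuracy(y_true, y_pred)

print(f"fine-grained gap: {fine_gap:.2f}")
print(f"coarse gap:       {coarse_gap:.2f}")
print(f"accuracy:         {acc:.2f}")
```

In this toy setting the measured gap shrinks when the codes are collapsed while accuracy is unchanged, simply because the predictions are held fixed and only the group definition changes. In a study where the grouped variable is also a model input, the classifier would be retrained for each grouping, so performance could differ between settings.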

Authors

  • Ruinan Jin
    Computer Science Department, The University of British Columbia, BC, V6T 1Z4, Canada.
  • Xiaoxiao Li
    Yale University, 06510 New Haven, CT USA.
  • Lorraine J Block
    School of Nursing, University of British Columbia, Vancouver, British Columbia, Canada.
  • Ivan Beschastnikh
    The University of British Columbia, Vancouver and Okanagan, BC, Canada.
  • Leanne M Currie
  • Charlene E Ronquillo
    School of Nursing, University of British Columbia Okanagan, Kelowna, BC, Canada. Electronic address: charlene.ronquillo@ubc.ca.