Confidence interval for micro-averaged F1 and macro-averaged F1 scores.

Journal: Applied intelligence (Dordrecht, Netherlands)

Abstract

Binary classification problems are common in the medical field, where sensitivity, specificity, accuracy, and negative and positive predictive values are often used to measure the performance of a binary predictor. In computer science, a classifier is usually evaluated with precision (positive predictive value) and recall (sensitivity). As a single summary measure of a classifier's performance, the F1 score, defined as the harmonic mean of precision and recall, is widely used in information retrieval and information extraction evaluation because it possesses favorable characteristics, especially when the prevalence is low. Some statistical methods for inference have been developed for the F1 score in binary classification problems; however, they have not been extended to multi-class classification. There are three types of F1 scores, and the statistical properties of these scores have hardly ever been discussed. We propose methods based on the large sample multivariate central limit theorem for estimating F1 scores with confidence intervals.
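As a rough illustration of the quantities discussed above, the following sketch computes the harmonic-mean score (commonly written F1) per class, then its micro-averaged and macro-averaged versions, from a multi-class confusion matrix. It is not the authors' procedure. The function names, the example matrix, and the row = true class / column = predicted class layout are all assumptions made for illustration. The interval shown is a plain Wald interval for the micro-averaged score only, which relies on the fact that in single-label multi-class problems the micro-averaged score equals overall accuracy, so a binomial variance applies. An interval for the macro-averaged score needs a multivariate argument and is not attempted here.

```python
import math

def per_class_f1(cm):
    """F1 for each class; cm[i][j] = count with true class i, predicted class j (assumed layout)."""
    k = len(cm)
    scores = []
    for i in range(k):
        tp = cm[i][i]
        fn = sum(cm[i]) - tp                       # true class i, predicted as something else
        fp = sum(cm[r][i] for r in range(k)) - tp  # predicted class i, truly something else
        denom = 2 * tp + fp + fn
        # F1 = 2PR / (P + R) = 2TP / (2TP + FP + FN); defined as 0 when the class never appears
        scores.append(2 * tp / denom if denom else 0.0)
    return scores

def micro_f1(cm):
    """Micro-averaged F1; with one label per item this reduces to overall accuracy."""
    n = sum(sum(row) for row in cm)
    return sum(cm[i][i] for i in range(len(cm))) / n

def macro_f1(cm):
    """Macro-averaged F1: unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(cm)
    return sum(scores) / len(scores)

def micro_f1_wald_ci(cm, z=1.959964):
    """Large-sample Wald interval for micro F1, treating it as a binomial proportion (assumption)."""
    n = sum(sum(row) for row in cm)
    p = micro_f1(cm)
    half = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half), min(1.0, p + half)

# Hypothetical 3-class confusion matrix, not data from the article
cm = [[50, 5, 5],
      [10, 30, 10],
      [5, 5, 80]]

print("micro F1:", micro_f1(cm))
print("macro F1:", macro_f1(cm))
print("95% Wald CI for micro F1:", micro_f1_wald_ci(cm))
```

The two averages can differ noticeably when classes are imbalanced or when one class is predicted poorly, because the macro average weights every class equally while the micro average weights every item equally.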

Authors

  • Kanae Takahashi
    Department of Medical Statistics, Osaka City University Graduate School of Medicine, Osaka, Japan.
  • Kouji Yamamoto
    Department of Biostatistics, School of Medicine, Yokohama City University, Yokohama, Japan.
  • Aya Kuchiba
    Graduate School of Health Innovation, Kanagawa University of Human Services, Kanagawa, Japan.
  • Tatsuki Koyama
    Department of Biostatistics, Vanderbilt University Medical Center, Nashville, TN, USA.
