Automatic GRBAS Scoring of Pathological Voices using Deep Learning and a Small Set of Labeled Voice Data.

Journal: Journal of voice : official journal of the Voice Foundation
PMID:

Abstract

OBJECTIVES: Auditory-perceptual evaluation frameworks, such as the grade-roughness-breathiness-asthenia-strain (GRBAS) scale, are the gold standard for the quantitative evaluation of pathological voice quality. However, the evaluation is subjective, and inter- and intra-rater variation limits the reproducibility of the ratings. Previous studies have proposed deep-learning-based automatic GRBAS score estimation to address this problem, but these methods require large amounts of labeled voice data. Therefore, this study investigates the potential of automatic GRBAS estimation using deep learning with smaller amounts of data.

Authors

  • Shunsuke Hidaka
    Graduate School of Design, Kyushu University, Fukuoka, Japan. Electronic address: hidaka.shunsuke.323@s.kyushu-u.ac.jp.
  • Yogaku Lee
    Graduate School of Design, Kyushu University, Fukuoka, Japan; Department of Otorhinolaryngology, Faculty of Medicine, Kyushu University, Fukuoka, Japan.
  • Moe Nakanishi
    Graduate School of Design, Kyushu University, Fukuoka, Japan.
  • Kohei Wakamiya
    Faculty of Design, Kyushu University, Fukuoka, Japan.
  • Takashi Nakagawa
    Department of Otorhinolaryngology, Faculty of Medicine, Kyushu University, Fukuoka, Japan.
  • Tokihiko Kaburagi
    Faculty of Design, Kyushu University, Fukuoka, Japan.