A Deep Learning Approach for Quantifying Vocal Fold Dynamics During Connected Speech Using Laryngeal High-Speed Videoendoscopy.

Journal: Journal of Speech, Language, and Hearing Research (JSLHR)
PMID:

Abstract

PURPOSE: Voice disorders are best assessed by examining vocal fold dynamics in connected speech. This can be achieved using flexible laryngeal high-speed videoendoscopy (HSV), which enables us to study vocal fold mechanics in high temporal detail. Analysis of vocal fold vibration using HSV requires accurate segmentation of the vocal fold edges. This article presents an automated deep-learning scheme that segments the glottal area in HSV recordings of connected speech; the glottal edges are then derived from the segmented area.

Authors

  • Ahmed M Yousef
    Department of Communicative Sciences and Disorders, Michigan State University, East Lansing.
  • Dimitar D Deliyski
    Department of Communicative Sciences and Disorders, Michigan State University, East Lansing.
  • Stephanie R C Zacharias
    Head and Neck Regenerative Medicine Program, Mayo Clinic, Scottsdale, AZ.
  • Alessandro de Alarcon
    Division of Pediatric Otolaryngology, Cincinnati Children's Hospital Medical Center, OH.
  • Robert F Orlikoff
    College of Allied Health Sciences, East Carolina University, Greenville, NC.
  • Maryam Naghibolhosseini
    Department of Communicative Sciences and Disorders, Michigan State University, East Lansing.