A Tunable Forced Alignment System Based on Deep Learning: Applications to Child Speech.

Journal: Journal of Speech, Language, and Hearing Research (JSLHR)
Published Date:

Abstract

PURPOSE: Phonetic forced alignment has many applications in the automated analysis of speech, particularly in the study of nonstandard speech such as children's speech. Manual alignment is tedious but serves as the gold standard for clinical-grade alignment. Current tools do not support direct training on manual alignments. Thus, a trainable, speaker-adaptive phonetic forced alignment system, Wav2TextGrid, was developed for children's speech. The source code for the method is publicly available, along with a graphical user interface, at https://github.com/pkadambi/Wav2TextGrid.

Authors

  • Prad Kadambi
    School of Electrical, Computer and Energy Engineering, Arizona State University, Tempe.
  • Tristan J Mahr
    Waisman Center, University of Wisconsin-Madison.
  • Katherine C Hustad
    Department of Communication Sciences and Disorders, University of Wisconsin-Madison.
  • Visar Berisha