A Tunable Forced Alignment System Based on Deep Learning: Applications to Child Speech.
Journal:
Journal of Speech, Language, and Hearing Research: JSLHR
Published Date:
Mar 31, 2025
Abstract
PURPOSE: Phonetic forced alignment has many applications in the automated analysis of speech, particularly in studying nonstandard speech such as children's speech. Manual alignment is tedious but serves as the gold standard for clinical-grade alignment. Current tools do not support direct training on manual alignments. Thus, a trainable, speaker-adaptive phonetic forced alignment system, Wav2TextGrid, was developed for children's speech. The source code for the method is publicly available, along with a graphical user interface, at https://github.com/pkadambi/Wav2TextGrid.