Convolutional neural network-based automatic classification of midsagittal tongue gestural targets using B-mode ultrasound images.

Journal: The Journal of the Acoustical Society of America

Published Date: Jun 1, 2017

Abstract

Tongue gestural target classification is of great interest to researchers in the speech production field. Recently, deep convolutional neural networks (CNNs) have outperformed standard feature extraction techniques in a variety of domains. In this letter, CNN-based tongue gestural target classification experiments, both speaker-dependent and speaker-independent, are conducted on tongue gestures recorded during natural speech production. The CNN-based method achieves state-of-the-art performance even though the CNN was not pre-trained; the only additional preprocessing was data augmentation.
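The difference between the speaker-dependent and speaker-independent settings can be illustrated with a minimal sketch. The abstract does not state the exact splitting protocol, so the leave-one-speaker-out scheme, the 20% test fraction, the function names, and the `speaker`/`frame`/`label` fields below are all assumptions made for illustration, not the authors' procedure.

```python
def speaker_independent_splits(samples):
    """Yield (held_out, train, test) folds, holding out one speaker per fold."""
    speakers = sorted({s["speaker"] for s in samples})
    for held_out in speakers:
        # The held-out speaker never appears in the training set.
        train = [s for s in samples if s["speaker"] != held_out]
        test = [s for s in samples if s["speaker"] == held_out]
        yield held_out, train, test


def speaker_dependent_split(samples, speaker, test_fraction=0.2):
    """Split a single speaker's samples into train and test, preserving order."""
    own = [s for s in samples if s["speaker"] == speaker]
    n_test = max(1, int(len(own) * test_fraction))
    return own[:-n_test], own[-n_test:]


# Hypothetical toy data: three speakers, five labelled ultrasound frames each.
samples = [
    {"speaker": spk, "frame": i, "label": i % 2}
    for spk in ("A", "B", "C")
    for i in range(5)
]

for held_out, train, test in speaker_independent_splits(samples):
    print(held_out, len(train), len(test))
```

In the speaker-independent folds the classifier is evaluated on a speaker it has never seen, which is generally the harder setting; in the speaker-dependent split, training and test frames come from the same speaker.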

Authors

Kele Xu
Department of Engineering, Université Pierre et Marie Curie, Paris 75005, France. Email: kelele.xu@gmail.com

Pierre Roussel
Langevin Institute, ESPCI-ParisTech, Paris 75005, France. Email: pierre.roussel@espci.fr

Tamás Gábor Csapó
Department of Telecommunications and Media Informatics, Budapest University of Technology and Economics, Budapest, Hungary. Email: csapot@tmit.bme.hu

Bruce Denby
Tianjin University, Tianjin 300000, China. Email: bruce.denby@upmc.fr

Keywords

Biomechanical Phenomena; Deep Learning; Female; Gestures; Humans; Male; Neural Networks, Computer; Pattern Recognition, Automated; Signal Processing, Computer-Assisted; Speech Acoustics; Tongue; Ultrasonography; Voice Quality

External Resources

PubMed ID: 28618815
