AIMC Topic: Speech

Showing 231 to 240 of 395 articles

Convolutional fusion network for monaural speech enhancement.

Neural networks : the official journal of the International Neural Network Society
Convolutional neural network (CNN) based methods, such as the convolutional encoder-decoder network, offer state-of-the-art results in monaural speech enhancement. In the conventional encoder-decoder network, large kernel size is often used to enhanc...

Streaming cascade-based speech translation leveraged by a direct segmentation model.

Neural networks : the official journal of the International Neural Network Society
The cascade approach to Speech Translation (ST) is based on a pipeline that concatenates an Automatic Speech Recognition (ASR) system followed by a Machine Translation (MT) system. Nowadays, state-of-the-art ST systems are populated with deep neural ...

Learning to recognize while learning to speak: Self-supervision and developing a speaking motor.

Neural networks : the official journal of the International Neural Network Society
Traditionally, learning speech synthesis and speech recognition were investigated as two separate tasks. This separation hinders incremental development for concurrent synthesis and recognition, where partially-learned synthesis and partially-learned...

Anti-transfer learning for task invariance in convolutional neural networks for speech processing.

Neural networks : the official journal of the International Neural Network Society
We introduce the novel concept of anti-transfer learning for speech processing with convolutional neural networks. While transfer learning assumes that the learning process for a target task will benefit from re-using representations learned for anot...

Combination of deep speaker embeddings for diarisation.

Neural networks : the official journal of the International Neural Network Society
Significant progress has recently been made in speaker diarisation after the introduction of d-vectors as speaker embeddings extracted from neural network (NN) speaker classifiers for clustering speech segments. To extract better-performing and more ...

A dual-stream deep attractor network with multi-domain learning for speech dereverberation and separation.

Neural networks : the official journal of the International Neural Network Society
Deep attractor networks (DANs) perform speech separation with discriminative embeddings and speaker attractors. Compared with methods based on the permutation invariant training (PIT), DANs define a deep embedding space and deliver a more elaborate r...

Deep neural network-based generalized sidelobe canceller for dual-channel far-field speech recognition.

Neural networks : the official journal of the International Neural Network Society
The traditional generalized sidelobe canceller (GSC) is a common speech enhancement front end for improving the noise robustness of automatic speech recognition (ASR) systems in far-field cases. However, the traditional GSC is optimized based on the...

What Can Network Science Tell Us About Phonology and Language Processing?

Topics in cognitive science
Contemporary psycholinguistic models place significant emphasis on the cognitive processes involved in the acquisition, recognition, and production of language but neglect many issues related to the representation of language-related information in t...

Improving Loanword Identification in Low-Resource Language with Data Augmentation and Multiple Feature Fusion.

Computational intelligence and neuroscience
Loanword identification has been studied in recent years to alleviate data sparseness in several natural language processing (NLP) tasks, such as machine translation and cross-lingual information retrieval. However, recent studies on this topic usu...

Deep learning architectures for estimating breathing signal and respiratory parameters from speech recordings.

Neural networks : the official journal of the International Neural Network Society
Respiration is an essential and primary mechanism for speech production. We first inhale and then produce speech while exhaling. When we run out of breath, we stop speaking and inhale. Though this process is involuntary, speech production involves a ...