AIMC Topic: Speech Recognition Software

Deep joint learning for language recognition.

Neural networks : the official journal of the International Neural Network Society
Deep learning methods for language recognition have achieved promising performance. However, most of the studies focus on frameworks for single types of acoustic features and single tasks. In this paper, we propose the deep joint learning strategies ...

CiwGAN and fiwGAN: Encoding information in acoustic data to model lexical learning with Generative Adversarial Networks.

Neural networks : the official journal of the International Neural Network Society
How can deep neural networks encode information that corresponds to words in human speech into raw acoustic data? This paper proposes two neural network architectures for modeling unsupervised lexical learning from raw acoustic inputs: ciwGAN (Catego...

D-MONA: A dilated mixed-order non-local attention network for speaker and language recognition.

Neural networks : the official journal of the International Neural Network Society
Attention-based convolutional neural network (CNN) models are increasingly being adopted for speaker and language recognition (SR/LR) tasks. These include time, frequency, spatial and channel attention, which can focus on useful time frames, frequenc...

Speaker recognition based on deep learning: An overview.

Neural networks : the official journal of the International Neural Network Society
Speaker recognition is the task of identifying persons from their voices. Recently, deep learning has dramatically revolutionized speaker recognition. However, there is a lack of comprehensive reviews on the exciting progress. In this paper, we review se...

Cycle consistent network for end-to-end style transfer TTS training.

Neural networks : the official journal of the International Neural Network Society
In this paper, we propose a cycle consistent network based end-to-end TTS for speaking style transfer, including intra-speaker, inter-speaker, and unseen speaker style transfer for both parallel and unparallel transfers. The proposed approach is buil...

Residual Neural Network precisely quantifies dysarthria severity-level based on short-duration speech segments.

Neural networks : the official journal of the International Neural Network Society
Recently, we have witnessed Deep Learning methodologies gaining significant attention for severity-based classification of dysarthric speech. Detecting dysarthria and quantifying its severity are of paramount importance in various real-life application...

Speech Technology for Healthcare: Opportunities, Challenges, and State of the Art.

IEEE reviews in biomedical engineering
Speech technology is not appropriately explored even though modern advances in speech technology, especially those driven by deep learning (DL) technology, offer unprecedented opportunities for transforming the healthcare industry. In this paper, we ha...

μ-law SGAN for generating spectra with more details in speech enhancement.

Neural networks : the official journal of the International Neural Network Society
The goal of monaural speech enhancement is to separate clean speech from noisy speech. Recently, many studies have employed generative adversarial networks (GAN) to deal with monaural speech enhancement tasks. When using generative adversarial networ...

Digital health technologies: opportunities and challenges in rheumatology.

Nature reviews. Rheumatology
The past decade in rheumatology has seen tremendous innovation in digital health technologies, including the electronic health record, virtual visits, mobile health, wearable technology, digital therapeutics, artificial intelligence and machine learn...

Machine learning and natural language processing methods to identify ischemic stroke, acuity and location from radiology reports.

PloS one
Accurate, automated extraction of clinical stroke information from unstructured text has several important applications. ICD-9/10 codes can misclassify ischemic stroke events and do not distinguish acuity or location. Expeditious, accurate data extra...