AIMC Topic: Speech

Multimodal Sensor-Input Architecture with Deep Learning for Audio-Visual Speech Recognition in Wild.

Sensors (Basel, Switzerland)
This paper investigates multimodal sensor architectures with deep learning for audio-visual speech recognition, focusing on in-the-wild scenarios. The term "in the wild" is used to describe AVSR for unconstrained natural-language audio streams and vi...

Human-Computer Interaction with a Real-Time Speech Emotion Recognition with Ensembling Techniques 1D Convolution Neural Network and Attention.

Sensors (Basel, Switzerland)
Emotions have a crucial function in the mental existence of humans. They are vital for identifying a person's behaviour and mental condition. Speech Emotion Recognition (SER) is extracting a speaker's emotional state from their speech signal. SER is ...

A Survey on Low-Latency DNN-Based Speech Enhancement.

Sensors (Basel, Switzerland)
This paper presents recent advances in low-latency, single-channel, deep neural network-based speech enhancement systems. The sources of latency and their acceptable values in different applications are described. This is followed by an analysis of t...

A Deep Learning Method Using Gender-Specific Features for Emotion Recognition.

Sensors (Basel, Switzerland)
Speech reflects people's mental state and using a microphone sensor is a potential method for human-computer interaction. Speech recognition using this sensor is conducive to the diagnosis of mental illnesses. The gender difference of speakers affect...

Qualitative and Artificial Intelligence-based Sentiment Analyses of Anti-LGBTI+ Hate Speech on Twitter in Turkey.

Issues in mental health nursing
The aim of this study was to evaluate hate speech in Turkish LGBTI+-related tweets during a one-month period of artificial intelligence-based sentiment analyses. Turkish tweets related to LGBTI+ were retrieved using the Python library Tweepy and were ev...

Classification of Depression and Its Severity Based on Multiple Audio Features Using a Graphical Convolutional Neural Network.

International journal of environmental research and public health
Audio features are physical features that reflect single or complex coordinated movements in the vocal organs. Hence, in speech-based automatic depression classification, it is critical to consider the relationship among audio features. Here, we prop...

Intelligent speech technologies for transcription, disease diagnosis, and medical equipment interactive control in smart hospitals: A review.

Computers in biology and medicine
The growing and aging of the world population have driven the shortage of medical resources in recent years, especially during the COVID-19 pandemic. Fortunately, the rapid development of robotics and artificial intelligence technologies help to adap...

Detecting Lombard Speech Using Deep Learning Approach.

Sensors (Basel, Switzerland)
Robust detection of Lombard speech in noise is challenging. This study proposes a strategy to detect Lombard speech using a machine learning approach for applications such as public address systems that work in near real time. The paper starts with the ...

Towards a simultaneously speaking bilingual robot: Primary study on the effect of gender and pitch of the robot's voice.

PloS one
With fast and reliable international transportation, more people with different language backgrounds can interact now. As a result, the need for communicative agents fluent in several languages to assist those people is highlighted. The high cost of ...

Design and Implementation of Machine Tool Life Inspection System Based on Sound Sensing.

Sensors (Basel, Switzerland)
The main causes of damage to industrial machinery are aging, corrosion, and the wear of parts, which affect the accuracy of machinery and product precision. Identifying problems early and predicting the life cycle of a machine for early maintenance c...