AI Medical Compendium Journal:
The Journal of the Acoustical Society of America

Speech emotion recognition based on transfer learning from the FaceNet framework.

Speech plays an important role in human-computer emotional interaction. FaceNet used in face recognition achieves great success due to its excellent feature extraction. In this study, we adopt the FaceNet model and improve it for speech emotion recog...

Seabed type and source parameters predictions using ship spectrograms in convolutional neural networks.

Broadband spectrograms from surface ships are employed in convolutional neural networks (CNNs) to predict the seabed type, ship speed, and closest point of approach (CPA) range. Three CNN architectures of differing size and depth are trained on diffe...

Model-based convolutional neural network approach to underwater source-range estimation.

This paper is part of a special issue on machine learning in acoustics. A model-based convolutional neural network (CNN) approach is presented to test the viability of this method as an alternative to conventional matched-field processing (MFP) for u...

Highlighting interlanguage phoneme differences based on similarity matrices and convolutional neural network.

The goal of this research is to find a way of highlighting the acoustic differences between consonant phonemes of the Polish and Lithuanian languages. For this purpose, similarity matrices are employed based on speech acoustic parameters combined wit...

Source depth estimation using spectral transformations and convolutional neural network in a deep-sea environment.

Multiple approaches for depth estimation in deep-ocean environments are discussed. First, a multispectral transformation for depth estimation (MSTDE) method based on the low-spatial-frequency interference in a constant sound speed is derived to estim...

Semi-supervised audio-driven TV-news speaker diarization using deep neural embeddings.

In this paper, an audio-driven, multimodal approach for speaker diarization in multimedia content is introduced and evaluated. The proposed algorithm is based on semi-supervised clustering of audio-visual embeddings, generated using deep learning tec...

Phonetic variability constrained bottleneck features for joint speaker recognition and physical task stress detection.

Normalizing intrinsic variabilities (e.g., variability in speech production brought on by aging, physical or cognitive task stress, Lombard effect, etc.) in speech and speaker recognition models is essential for system robustness. This study focuses ...

Deep Convolutional Neural Networks for Thyroid Tumor Grading using Ultrasound B-mode Images.

The performances of deep convolutional neural network (DCNN) modeling and transfer learning (TF) for thyroid tumor grading using ultrasound imaging were evaluated. This retrospective study included input patient data (ultrasound B-mode image sets) as...

A two-stage deep learning algorithm for talker-independent speaker separation in reverberant conditions.

Speaker separation is a special case of speech separation, in which the mixture signal comprises two or more speakers. Many talker-independent speaker separation methods have been introduced in recent years to address this problem in anechoic conditi...

Polyphonic pitch tracking with deep layered learning.

This article presents a polyphonic pitch tracking system that is able to extract both framewise and note-based estimates from audio. The system uses several artificial neural networks trained individually in a deep layered learning setup. First, casc...