BACKGROUND: Prior to the COVID-19 global health emergency, reducing direct contacts between therapists and patients is an important issue, and could be achieved by using robots to perform certain caring activities.
Humans produce two forms of cognitively complex vocalizations: speech and song. It is debated whether these differ based primarily on culturally specific, learned features, or if acoustical features can reliably distinguish them. We study the spectro...
Neural networks : the official journal of the International Neural Network Society
39368276
Recently, denoising diffusion models have demonstrated remarkable performance among generative models in various domains. However, in the speech domain, there are limitations in complexity and controllability to apply diffusion models for time-varyin...
The Bel Canto performance is a complex and multidimensional art form encompassing pitch, timbre, technique, and affective expression. To accurately reflect a performer's singing proficiency, it is essential to quantify and evaluate their vocal techni...