Toward Non-Invasive Voice Restoration: A Deep Learning Approach Using Real-Time MRI

Journal: medRxiv

Published Date: Jan 1, 2025

Abstract

Despite recent advances in brain–computer interfaces (BCIs) for speech restoration, existing systems remain invasive, costly, and inaccessible to individuals with congenital mutism or neurodegenerative disease. We present a proof-of-concept pipeline that synthesizes personalized speech directly from real-time magnetic resonance imaging (rtMRI) of the vocal tract, without requiring acoustic input. Segmented rtMRI frames are mapped to articulatory class representations using a Pix2Pix conditional GAN; these representations are then transformed into synthetic audio waveforms by a convolutional neural network that models the articulatory-to-acoustic relationship. The outputs are rendered as audio and evaluated with speaker-similarity metrics derived from Resemblyzer embeddings. While preliminary, our results suggest that even silent articulatory motion encodes enough information to approximate a speaker’s vocal characteristics, offering a non-invasive direction for future speech restoration in individuals who have lost or never developed a voice.
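
The abstract does not say how the speaker-similarity metric is computed from the embeddings. One common choice for comparing speaker embeddings is cosine similarity, and the sketch below illustrates that choice only; it is an assumption, not the author's confirmed method. The function name `cosine_similarity` and the 4-dimensional toy vectors are hypothetical placeholders standing in for embeddings of a reference recording and a synthesized utterance.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine similarity of two equal-length embedding vectors."""
    if len(a) != len(b):
        raise ValueError("embeddings must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("embeddings must be non-zero")
    return dot / (norm_a * norm_b)


# Hypothetical toy vectors; real speaker embeddings would come from a
# speaker encoder applied to the reference and synthesized audio.
reference_embedding = [0.5, 0.5, 0.5, 0.5]
synthetic_embedding = [0.5, 0.5, 0.5, -0.5]

score = cosine_similarity(reference_embedding, synthetic_embedding)
print(f"speaker similarity: {score:.3f}")
```

Under this assumption, a score near 1 would mean the synthesized voice sits close to the reference speaker in embedding space, while a score near 0 would mean little resemblance.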

Authors

Mohamad Saleh

