A unified acoustic-to-speech-to-language embedding space captures the neural basis of natural language processing in everyday conversations.

Journal: Nature Human Behaviour
Published Date:

Abstract

This study introduces a unified computational framework connecting acoustic, speech and word-level linguistic structures to study the neural basis of everyday conversations in the human brain. We used electrocorticography to record neural signals across 100 h of speech production and comprehension as participants engaged in open-ended real-life conversations. We extracted low-level acoustic, mid-level speech and contextual word embeddings from a multimodal speech-to-text model (Whisper). We developed encoding models that linearly map these embeddings onto brain activity during speech production and comprehension. Remarkably, this model accurately predicts neural activity at each level of the language processing hierarchy across hours of new conversations not used in training the model. The internal processing hierarchy in the model is aligned with the cortical hierarchy for speech and language processing, where sensory and motor regions better align with the model's speech embeddings, and higher-level language areas better align with the model's language embeddings. The Whisper model captures the temporal sequence of language-to-speech encoding before word articulation (speech production) and speech-to-language encoding post articulation (speech comprehension). The embeddings learned by this model outperform symbolic models in capturing neural activity supporting natural speech and language. These findings support a paradigm shift towards unified computational models that capture the entire processing hierarchy for speech comprehension and production in real-world conversations.
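The linear encoding approach described in the abstract can be illustrated with a minimal sketch. It is not the authors' pipeline: the use of ridge regularization, the function names (`fit_ridge`, `columnwise_corr`, `encoding_score`, `demo`), the array shapes and the synthetic data are all assumptions made for illustration. The sketch fits a linear map from embeddings to per-electrode activity on a training split, then scores it by correlating predicted and observed activity on a held-out split, mirroring the idea of evaluating on conversations not used in training.

```python
import numpy as np

def fit_ridge(X, Y, alpha=1.0):
    """Closed-form ridge regression: W = (X^T X + alpha * I)^-1 X^T Y."""
    d = X.shape[1]
    return np.linalg.solve(X.T @ X + alpha * np.eye(d), X.T @ Y)

def columnwise_corr(A, B):
    """Pearson correlation between matching columns of A and B."""
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    denom = np.linalg.norm(A, axis=0) * np.linalg.norm(B, axis=0)
    return (A * B).sum(axis=0) / denom

def encoding_score(X_train, Y_train, X_test, Y_test, alpha=1.0):
    """Fit on the training split, return one held-out correlation per electrode."""
    # Standardize embeddings with training statistics only, to avoid leakage.
    mu = X_train.mean(axis=0)
    sd = X_train.std(axis=0) + 1e-8
    Xtr = (X_train - mu) / sd
    Xte = (X_test - mu) / sd
    # Center the neural signal so no explicit intercept column is needed.
    y_mu = Y_train.mean(axis=0)
    W = fit_ridge(Xtr, Y_train - y_mu, alpha)
    Y_pred = Xte @ W + y_mu
    return columnwise_corr(Y_pred, Y_test)

def demo(seed=0):
    """Synthetic stand-in: rows play the role of words, columns of Y the role of electrodes."""
    rng = np.random.default_rng(seed)
    n_train, n_test, d, n_elec = 400, 100, 16, 3  # hypothetical sizes
    W_true = rng.normal(size=(d, n_elec))
    X = rng.normal(size=(n_train + n_test, d))
    Y = X @ W_true + 0.5 * rng.normal(size=(n_train + n_test, n_elec))
    return encoding_score(X[:n_train], Y[:n_train], X[n_train:], Y[n_train:])

if __name__ == "__main__":
    print(demo())  # one held-out correlation per simulated electrode
```

In a real analysis, `X` would hold the acoustic, speech or language embeddings extracted from the model for each word, and `Y` the neural signal at a chosen lag relative to word onset. Repeating the fit across lags and across embedding types is one way to compare how well each level of the hierarchy predicts a given electrode.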

Authors

  • Ariel Goldstein
    Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Haocheng Wang
    Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Leonard Niekerken
    Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Mariano Schain
    Google Research, Mountain View, CA, USA.
  • Zaid Zada
    Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Bobbi Aubrey
    Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Tom Sheffer
    Google Research, Mountain View, CA, USA.
  • Samuel A Nastase
    Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Harshvardhan Gazula
    Tri-institutional Center for Translational Research in Neuroimaging and Data Science (TReNDS), Georgia State University, Georgia Institute of Technology, Emory University, Atlanta, GA, United States.
  • Aditi Singh
  • Aditi Rao
    Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Gina Choe
    Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Catherine Kim
    Department of Psychology and the Princeton Neuroscience Institute, Princeton University, Princeton, NJ, USA.
  • Werner Doyle
    Comprehensive Epilepsy Center, Department of Neurology, School of Medicine, New York University, New York, USA.
  • Daniel Friedman
    Duke Clinical Research Institute, Durham, North Carolina; Division of Cardiology, Department of Medicine, Duke University School of Medicine, Durham, North Carolina.
  • Sasha Devore
    New York University School of Medicine, New York, NY, USA.
  • Patricia Dugan
    Department of Neurology, NYU School of Medicine, New York, NY 10016, United States of America.
  • Avinatan Hassidim
    Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Michael Brenner
    Google Research, Mountain View, CA, USA.
  • Yossi Matias
    Google Research, Google LLC, 1600 Amphitheatre Parkway, Mountain View, CA, USA.
  • Orrin Devinsky
    Comprehensive Epilepsy Center, Department of Neurology, School of Medicine, New York University, New York, USA.
  • Adeen Flinker
    New York University School of Medicine, New York, NY, USA.
  • Uri Hasson
    Princeton University, United States.