A unified beamforming and source separation model for static and dynamic human-robot interaction.

Journal: JASA Express Letters
Published Date:

Abstract

This paper presents a unified model for combining beamforming and blind source separation (BSS). The validity of the model's assumptions is confirmed by accurately recovering target speech information in noise using oracle information. Using real static human-robot interaction (HRI) data, the proposed combination of BSS with the minimum-variance distortionless response beamformer provides a greater signal-to-noise ratio (SNR) than previous parallel and cascade systems that combine BSS and beamforming. In the difficult-to-model dynamic HRI environment, where the parallel combination is infeasible, the system provides an SNR gain 2.8 dB greater than that obtained with the cascade combination.
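For readers unfamiliar with the minimum-variance distortionless response (MVDR) beamformer referenced in the abstract, the sketch below shows the standard per-frequency-bin MVDR weight computation. It is a generic illustration only, not the paper's unified BSS-beamforming combination; the function name, array size, and steering vector are hypothetical.

```python
import numpy as np

def mvdr_weights(R, d):
    """Standard MVDR beamformer weights for one frequency bin (illustrative sketch).

    R : (M, M) complex Hermitian noise spatial covariance matrix
    d : (M,) complex steering vector toward the target source
    Returns the (M,) weight vector w = R^{-1} d / (d^H R^{-1} d).
    """
    R_inv_d = np.linalg.solve(R, d)           # R^{-1} d without forming an explicit inverse
    return R_inv_d / (d.conj() @ R_inv_d)     # normalization enforcing the distortionless constraint

# Toy usage (hypothetical 4-microphone array with a synthetic covariance matrix).
rng = np.random.default_rng(0)
M = 4
A = rng.standard_normal((M, M)) + 1j * rng.standard_normal((M, M))
R = A @ A.conj().T + M * np.eye(M)            # well-conditioned Hermitian positive-definite covariance
d = np.exp(-1j * np.pi * np.arange(M) * 0.3)  # hypothetical steering vector
w = mvdr_weights(R, d)
print(np.abs(w.conj() @ d))                   # ~1.0: target direction passes undistorted
```

The distortionless constraint (w^H d = 1) preserves the target signal while the minimum-variance criterion suppresses noise and interference; the paper's contribution concerns how such a beamformer is combined with BSS, which is not depicted here.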

Authors

  • Jorge Wuth
    Speech Processing and Transmission Laboratory, Electrical Engineering Department, University of Chile, Santiago 8370451, Chile.
  • Rodrigo Mahu
    Speech Processing and Transmission Laboratory, Electrical Engineering Department, University of Chile, Santiago 8370451, Chile.
  • Israel Cohen
    Department of Diagnostic Imaging, Chaim Sheba Medical Center, Emek Haela St. 1, 52621, Ramat Gan, Israel.
  • Richard M Stern
    Department of Electrical and Computer Engineering, Carnegie Mellon University, Pittsburgh, PA 15213, USA.
  • Néstor Becerra Yoma
    Speech Processing and Transmission Laboratory, Electrical Engineering Department, University of Chile, Santiago 8370451, Chile.