An individualization approach for head-related transfer function in arbitrary directions based on deep learning.

Journal: JASA Express Letters
Published Date:

Abstract

This paper presents a deep-learning approach for individualizing head-related transfer functions (HRTFs) in arbitrary directions. A dual-autoencoder architecture is used to establish the relationship between the HRTF magnitude spectrum and an arbitrarily given direction and set of anthropometric parameters. In this architecture, a variational autoencoder (VAE) extracts interpretable and exploitable features of full-space HRTF spectra, while a second autoencoder (AE) embeds the corresponding directions and anthropometric parameters. A deep neural network model is finally trained to establish the relationship between these two sets of representative features. Experimental results show that the proposed method outperforms state-of-the-art methods in terms of spectral distortion.
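For readers unfamiliar with the evaluation metric, the sketch below shows how spectral distortion is commonly computed: the root-mean-square difference, in dB, between a measured and an estimated HRTF magnitude spectrum across frequency bins. The abstract does not give the exact definition used in the paper (frequency range, or how values are averaged over directions and subjects), so this formulation and the function name `spectral_distortion` are assumptions for illustration only.

```python
import math


def spectral_distortion(h_ref, h_est):
    """RMS log-magnitude difference in dB between two magnitude spectra.

    h_ref, h_est: sequences of positive linear magnitudes, one per
    frequency bin (assumed layout, not taken from the paper).
    """
    if len(h_ref) != len(h_est) or len(h_ref) == 0:
        raise ValueError("spectra must be non-empty and of equal length")
    if any(v <= 0 for v in h_ref) or any(v <= 0 for v in h_est):
        raise ValueError("magnitudes must be strictly positive")

    # Per-bin error in dB: 20 * log10(|H_ref| / |H_est|), squared.
    squared_db_errors = [
        (20.0 * math.log10(ref / est)) ** 2
        for ref, est in zip(h_ref, h_est)
    ]
    # Root of the mean squared dB error over all frequency bins.
    return math.sqrt(sum(squared_db_errors) / len(squared_db_errors))


# Identical spectra give zero distortion.
print(spectral_distortion([1.0, 2.0, 0.5], [1.0, 2.0, 0.5]))  # 0.0
```

A lower value means the estimated spectrum is closer to the measured one. An estimate that is uniformly ten times too large in magnitude would score about 20 dB under this definition.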

Authors

  • Dingding Yao
    Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China.
  • Jiale Zhao
    School of Computer and Information Engineering, Tianjin Chengjian University, Tianjin 300384, China.
  • Longbiao Cheng
    Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China.
  • Junfeng Li
    School of Mechanical and Electronic Engineering, Wuhan University of Technology, Wuhan, China.
  • Xiaodong Li
  • Xiaochao Guo
    Department of Radiology, Peking University First Hospital, No.8, Xishiku Street, Xicheng District, Beijing, 100034, China.
  • Yonghong Yan
    Key Laboratory of Speech Acoustics and Content Understanding, Institute of Acoustics, Chinese Academy of Sciences, Beijing 100190, China.