Towards Perception-Informed Latent HRTF Representations
Journal: arXiv
Published Date: Jul 3, 2025
Abstract
Personalized head-related transfer functions (HRTFs) are essential for
ensuring a realistic auditory experience over headphones, because they take
into account individual anatomical differences that affect listening. Most
machine learning approaches to HRTF personalization rely on a learned
low-dimensional latent space to generate or select custom HRTFs for a listener.
However, these latent representations are typically trained to optimize
spectral reconstruction rather than perceptual compatibility, so distances in
the latent space may not align with perceptual distance. In this work,
we first study whether traditionally learned HRTF representations are well
correlated with perceptual relations using auditory-based objective perceptual
metrics; we then propose a method for explicitly embedding HRTFs into a
perception-informed latent space, leveraging a metric-based loss function and
supervision via Metric Multidimensional Scaling (MMDS). Finally, we demonstrate
the applicability of these learned representations to the task of HRTF
personalization. We suggest that our method has the potential to render
personalized spatial audio, leading to an improved listening experience.
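
The abstract gives no implementation details, so the following is only a minimal sketch of metric MDS itself, not the authors' pipeline. It assumes a symmetric matrix D of pairwise perceptual distances between HRTFs is already available, and it uses the standard SMACOF iteration (Guttman transform) with unit weights to find low-dimensional coordinates whose Euclidean distances approximate D. The function names (pairwise_distances, metric_mds, stress) and the synthetic distance matrix in the demo are hypothetical and chosen for illustration.

```python
import numpy as np


def pairwise_distances(X):
    """Euclidean distance between every pair of rows in X."""
    diff = X[:, None, :] - X[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=-1))


def metric_mds(D, dim=2, n_iter=300, seed=0):
    """Embed n items in `dim` dimensions so that Euclidean distances
    approximate the target distances D (SMACOF, unit weights)."""
    n = D.shape[0]
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, dim))  # random initial configuration
    for _ in range(n_iter):
        dist = pairwise_distances(X)
        # Ratio of target to current distance; defined as 0 where dist == 0.
        with np.errstate(divide="ignore", invalid="ignore"):
            ratio = np.where(dist > 0, D / dist, 0.0)
        # Guttman transform: B has -ratio off the diagonal and row sums on it.
        B = -ratio
        B[np.diag_indices(n)] = ratio.sum(axis=1)
        X = B @ X / n
    return X


def stress(D, X):
    """Normalized stress: 0 means the embedding reproduces D exactly."""
    return np.sqrt(((D - pairwise_distances(X)) ** 2).sum() / (D ** 2).sum())


if __name__ == "__main__":
    # Synthetic stand-in for a perceptual distance matrix (assumption:
    # real values would come from an auditory-model-based metric).
    rng = np.random.default_rng(1)
    points = rng.standard_normal((20, 3))
    D = pairwise_distances(points)
    X = metric_mds(D, dim=3)
    print("normalized stress:", stress(D, X))
```

One plausible use, stated here as an assumption rather than as the paper's design, is to treat the resulting coordinates as supervision targets for an encoder, so that distances between latent codes track the perceptual distances in D.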