Efficient inverse graphics in biological face processing.

Journal: Science Advances
PMID:

Abstract

Vision not only detects and recognizes objects, but also performs rich inferences about the underlying scene structure that causes the patterns of light we see. Inverting generative models, or "analysis-by-synthesis", presents a possible solution, but its mechanistic implementations have typically been too slow for online perception, and their mapping to neural circuits remains unclear. Here we present a neurally plausible efficient inverse graphics model and test it in the domain of face recognition. The model is based on a deep neural network that learns to invert a three-dimensional face graphics program in a single fast feedforward pass. It explains human behavior qualitatively and quantitatively, including the classic "hollow face" illusion, and it maps directly onto a specialized face-processing circuit in the primate brain. The model fits both behavioral and neural data better than state-of-the-art computer vision models, and suggests an interpretable reverse-engineering account of how the brain transforms images into percepts.

Authors

  • Ilker Yildirim
    Center for Brains, Minds, and Machines, MIT, Cambridge, MA 02138, United States; Department of Brain and Cognitive Sciences, MIT, Cambridge, MA 02138, United States. Electronic address: ilkery@mit.edu.
  • Mario Belledonne
    Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.
  • Winrich Freiwald
    The Center for Brains, Minds and Machines, MIT, Cambridge, MA, USA.
  • Josh Tenenbaum
    Department of Brain and Cognitive Sciences, MIT, Cambridge, MA, USA.