Capsule networks as recurrent models of grouping and segmentation.

Journal: PLoS computational biology

Published Date: Jul 21, 2020

Abstract

Classically, visual processing is described as a cascade of local feedforward computations. Feedforward Convolutional Neural Networks (ffCNNs) have shown how powerful such models can be. However, using visual crowding as a well-controlled challenge, we previously showed that no classic model of vision, including ffCNNs, can explain human global shape processing. Here, we show that Capsule Neural Networks (CapsNets), combining ffCNNs with recurrent grouping and segmentation, solve this challenge. We also show that ffCNNs and standard recurrent CNNs do not, suggesting that the grouping and segmentation capabilities of CapsNets are crucial. Furthermore, we provide psychophysical evidence that grouping and segmentation are implemented recurrently in humans, and show that CapsNets reproduce these results well. We discuss why recurrence seems needed to implement grouping and segmentation efficiently. Together, we provide mutually reinforcing psychophysical and computational evidence that a recurrent grouping and segmentation process is essential to understand the visual system and create better models that harness global shape computations.

Authors

Adrien Doerig
Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Lynn Schmittwilken
Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Bilge Sayim
Institute of Psychology, University of Bern, Bern, Switzerland.

Mauro Manassi
School of Psychology, University of Aberdeen, Scotland, United Kingdom.

Michael H Herzog
Laboratory of Psychophysics, Brain Mind Institute, École Polytechnique Fédérale de Lausanne (EPFL), Lausanne, Switzerland.

Keywords

Algorithms; Computational Biology; Computer Simulation; Female; Humans; Image Processing, Computer-Assisted; Male; Models, Biological; Neural Networks, Computer; Normal Distribution; Pattern Recognition, Visual; Reproducibility of Results; Vision, Ocular

External Resources

PubMed ID: 32692780
