Human-like monocular depth biases in deep neural networks.

Journal: PLoS Computational Biology

Abstract

Human depth perception from 2D images is systematically distorted, yet the nature of these distortions is not fully understood. By examining error patterns in depth estimation for both humans and deep neural networks (DNNs), which have shown remarkable abilities in monocular depth estimation, we can gain insights into how to construct functional models of human 3D vision and design artificial models with improved interpretability. Here, we propose a comprehensive human-DNN comparison framework for a monocular depth judgment task. Using a novel human-annotated dataset of natural indoor scenes and a systematic analysis of absolute depth judgments, we investigate error patterns in both humans and DNNs. Employing exponential-affine fitting, we decompose depth estimation errors into depth compression, per-image affine transformations (including scaling, shearing, and translation), and residual errors. Our analysis reveals that human depth judgments exhibit systematic and consistent biases, including depth compression, a vertical bias (perceiving objects in the lower visual field as closer), and consistent per-image affine distortions across participants. Intriguingly, we find that DNNs with higher accuracy partially recapitulate these human biases, demonstrating greater similarity in affine parameters and residual error patterns. This suggests that these seemingly suboptimal human biases may reflect efficient, ecologically adapted strategies for depth inference from inherently ambiguous monocular images. However, while DNNs capture metric-level residual error patterns similar to humans, they fail to reproduce human-level accuracy in ordinal depth perception within the affine-invariant space. These findings underscore the importance of evaluating error patterns beyond raw accuracy, providing new insights into how humans and computational models resolve depth ambiguity.
Our dataset and methodology provide a framework for evaluating the alignment between computational models and human perceptual biases, thereby advancing our understanding of visual space representation and guiding the development of models that more faithfully capture human depth perception.
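The exponential-affine decomposition described above can be sketched as a simple fitting procedure. The sketch below is illustrative only, not the authors' published code: it assumes depth compression is modeled as a power law on estimated depth, and that the per-image affine component comprises a depth scale, shears along the image axes, and a translation. The function name, parameter grid, and model form are assumptions for illustration.

```python
import numpy as np

def fit_exponential_affine(d_est, d_true, xs, ys,
                           alphas=np.linspace(0.2, 1.5, 27)):
    """Fit d_true ~ s * d_est**alpha + hx*x + hy*y + t per image.

    The compression exponent alpha is found by grid search; for each
    candidate alpha, the affine parameters (scale s, shears hx/hy,
    translation t) are solved in closed form by linear least squares.
    Returns the best-fitting parameters and the residual errors.
    """
    best = None
    for alpha in alphas:
        # Design matrix: compressed depth, image coordinates, constant.
        A = np.stack([d_est ** alpha, xs, ys, np.ones_like(xs)], axis=1)
        coef, *_ = np.linalg.lstsq(A, d_true, rcond=None)
        resid = d_true - A @ coef
        sse = float(resid @ resid)
        if best is None or sse < best[0]:
            best = (sse, alpha, coef, resid)
    sse, alpha, (s, hx, hy, t), resid = best
    return {"alpha": float(alpha), "scale": float(s),
            "shear_x": float(hx), "shear_y": float(hy),
            "translation": float(t), "residuals": resid}
```

On synthetic data generated from this same model, the procedure recovers the compression exponent and affine parameters exactly; on real judgments, the residuals left over after removing compression and the per-image affine fit are what the paper compares between humans and DNNs.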

Authors

  • Yuki Kubota
    Communication Science Laboratories, NTT, Inc., Kanagawa, Japan.
  • Taiki Fukiage
    Communication Science Laboratories, NTT, Inc., Kanagawa, Japan.
