Adopting a human developmental visual diet yields robust, shape-based AI vision
Journal: arXiv
Published Date: Jul 3, 2025
Abstract
Despite years of research and the dramatic scaling of artificial intelligence
(AI) systems, a striking misalignment between artificial and human vision
persists. Unlike humans, AI relies heavily on texture features rather than
shape information, lacks robustness to image distortions, remains highly
vulnerable to adversarial attacks, and struggles to recognise simple abstract
shapes within complex backgrounds. To close this gap, we introduce a
solution that arises from a previously underexplored direction: rather than
scaling up, we take inspiration from how human vision develops from early
infancy into adulthood. We quantified this visual maturation by synthesising
decades of psychophysical and neurophysiological research into a novel
developmental visual diet (DVD) for AI vision. We show that guiding AI systems
through this human-inspired curriculum produces models that closely align with
human behaviour on every hallmark of robust vision tested, yielding the
strongest reported reliance on shape information to date, abstract shape
recognition beyond the state of the art, higher robustness to image
corruptions, and stronger resilience to adversarial attacks. By outperforming
high-parameter AI foundation models trained on orders of magnitude more data,
we provide evidence that robust AI vision can be achieved by guiding how a
model learns, not merely how much it learns, offering a
resource-efficient route toward safer and more human-like artificial visual
systems.