The Puzzle of Evaluating Moral Cognition in Artificial Agents.

Journal: Cognitive Science
Published Date:

Abstract

In developing artificial intelligence (AI), researchers often benchmark against human performance as a measure of progress. Is this kind of comparison possible for moral cognition? Given that human moral judgment often hinges on intangible properties like "intention," which may have no natural analog in artificial agents, it may prove difficult to design a "like-for-like" comparison between the moral behavior of artificial and human agents. What would a measure of moral behavior for both humans and AI look like? We unravel the complexity of this question by discussing examples within reinforcement learning and generative AI, and we examine how the puzzle of evaluating artificial agents' moral cognition remains open for further investigation within cognitive science.

Authors

  • Madeline G Reinecke
    Google DeepMind.
  • Yiran Mao
    Google DeepMind.
  • Markus Kunesch
    Google DeepMind.
  • Edgar A Duénez-Guzmán
    DeepMind, London EC4A 3TW, UK. duenez@deepmind.com
  • Julia Haas
    Google DeepMind.
  • Joel Z Leibo
    DeepMind, London, UK.