Placing language in an integrated understanding system: Next steps toward human-level performance in neural language models.

Journal: Proceedings of the National Academy of Sciences of the United States of America
PMID:

Abstract

Language is crucial for human intelligence, but what exactly is its role? We take language to be a part of a system for understanding and communicating about situations. In humans, these abilities emerge gradually from experience and depend on domain-general principles of biological neural networks: connection-based learning, distributed representation, and context-sensitive, mutual constraint satisfaction-based processing. Current artificial language processing systems rely on the same domain-general principles, embodied in artificial neural networks. Indeed, recent progress in this field depends on query-based attention, which extends the ability of these systems to exploit context and has contributed to remarkable breakthroughs. Nevertheless, most current models focus exclusively on language-internal tasks, limiting their ability to perform tasks that depend on understanding situations. These systems also lack memory for the contents of prior situations outside of a fixed contextual span. We describe the organization of the brain's distributed understanding system, which includes a fast learning system that addresses the memory problem. We sketch a framework for future models of understanding, drawing equally on cognitive neuroscience and artificial intelligence and exploiting query-based attention. We highlight relevant current directions and consider further developments needed to fully capture human-level language understanding in a computational system.

Authors

  • James L McClelland
    Department of Psychology, Stanford University, Stanford, CA, USA. jlmcc@stanford.edu
  • Felix Hill
    DeepMind, London N1C 4AG, United Kingdom. felixhill@google.com
  • Maja Rudolph
    Bosch Center for Artificial Intelligence, Renningen 71272, Germany. majarita.rudolph@de.bosch.com
  • Jason Baldridge
    Google Research, Austin, TX 78701, USA. jasonbaldridge@google.com
  • Hinrich Schütze
    Center for Information and Language Processing, Ludwig Maximilian University of Munich, Munich 80538, Germany. inquiries@cislmu.org