Mastering the game of Go with deep neural networks and tree search

Journal: Nature
Published Date:

Abstract

The game of Go has long been viewed as the most challenging of classic games for artificial intelligence owing to its enormous search space and the difficulty of evaluating board positions and moves. Here we introduce a new approach to computer Go that uses 'value networks' to evaluate board positions and 'policy networks' to select moves. These deep neural networks are trained by a novel combination of supervised learning from human expert games, and reinforcement learning from games of self-play. Without any lookahead search, the neural networks play Go at the level of state-of-the-art Monte Carlo tree search programs that simulate thousands of random games of self-play. We also introduce a new search algorithm that combines Monte Carlo simulation with value and policy networks. Using this search algorithm, our program AlphaGo achieved a 99.8% winning rate against other Go programs, and defeated the human European Go champion by 5 games to 0. This is the first time that a computer program has defeated a human professional player in the full-sized game of Go, a feat previously thought to be at least a decade away.
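
The abstract describes a search that combines Monte Carlo simulation with value and policy networks but gives no formulas. The following is only an illustrative sketch of how such a combination could look, not the authors' implementation. The names `Edge`, `leaf_evaluation`, `select_move`, and `backup`, the mixing weight `mix`, the exploration constant `c_explore`, and the exact form of the exploration bonus are all assumptions made for this example.

```python
import math
from dataclasses import dataclass
from typing import Dict


@dataclass
class Edge:
    """Statistics for one candidate move in a search tree (hypothetical layout)."""
    prior: float             # move probability, assumed to come from a policy network
    visits: int = 0          # number of simulations that passed through this move
    value_sum: float = 0.0   # sum of leaf evaluations backed up through this move

    def mean_value(self) -> float:
        return self.value_sum / self.visits if self.visits else 0.0


def leaf_evaluation(value_estimate: float, rollout_outcome: float, mix: float = 0.5) -> float:
    """Blend a value-network estimate with the outcome of one simulated game.

    `mix` is an assumed weight: 0 trusts only the value estimate, 1 only the rollout.
    """
    return (1.0 - mix) * value_estimate + mix * rollout_outcome


def select_move(edges: Dict[str, Edge], c_explore: float = 1.0) -> str:
    """Pick the move with the best mean value plus a prior-weighted exploration bonus."""
    total_visits = sum(edge.visits for edge in edges.values())

    def score(edge: Edge) -> float:
        # The bonus favours moves the policy prior likes and shrinks as visits accumulate.
        bonus = c_explore * edge.prior * math.sqrt(total_visits + 1) / (1 + edge.visits)
        return edge.mean_value() + bonus

    return max(edges, key=lambda move: score(edges[move]))


def backup(edge: Edge, evaluation: float) -> None:
    """Record one simulation result on the edge that was selected."""
    edge.visits += 1
    edge.value_sum += evaluation
```

In this sketch the policy prior steers which moves are explored first, while the blended leaf evaluation supplies the value that is averaged per move. A move with a high prior that keeps returning poor evaluations is eventually passed over in favour of less-visited alternatives.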

Authors

  • David Silver
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Aja Huang
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Chris J Maddison
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Arthur Guez
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Laurent Sifre
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • George van den Driessche
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Julian Schrittwieser
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Ioannis Antonoglou
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Veda Panneershelvam
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Marc Lanctot
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Sander Dieleman
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Dominik Grewe
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • John Nham
    Google, 1600 Amphitheatre Parkway, Mountain View, California 94043, USA.
  • Nal Kalchbrenner
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Ilya Sutskever
    Google, 1600 Amphitheatre Parkway, Mountain View, California 94043, USA.
  • Timothy Lillicrap
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Madeleine Leach
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Koray Kavukcuoglu
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Thore Graepel
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.
  • Demis Hassabis
    Google DeepMind, 5 New Street Square, London EC4A 3TW, UK.