Policy search in continuous action domains: An overview.

Journal: Neural Networks: the official journal of the International Neural Network Society

Abstract

Continuous action policy search is currently the focus of intensive research, driven both by the recent success of deep reinforcement learning algorithms and by the emergence of competitors based on evolutionary algorithms. In this paper, we present a broad survey of policy search methods that provides a unified perspective on very different approaches, including Bayesian optimization and directed exploration methods. The main message of this overview lies in the relationships between the families of methods, but we also outline some of the factors underlying the sample efficiency of the various approaches.

Authors

  • Olivier Sigaud
    Sorbonne Universités, UPMC Univ Paris 06, UMR 7222, F-75005 Paris, France. Electronic address: olivier.sigaud@isir.upmc.fr.
  • Freek Stulp
    Unité d'Informatique et d'Ingénierie des Systèmes, ENSTA ParisTech, Université Paris-Saclay, 828 bd des Maréchaux, 91762 Palaiseau cedex, France; FLOWERS Research Team, INRIA, Bordeaux, France. Electronic address: freek.stulp@ensta-paristech.fr.