Learning to Predict Consequences as a Method of Knowledge Transfer in Reinforcement Learning.

Journal: IEEE Transactions on Neural Networks and Learning Systems
PMID:

Abstract

The reinforcement learning (RL) paradigm allows agents to solve tasks through trial-and-error learning. To be capable of efficient, long-term learning, RL agents should be able to apply knowledge gained in the past to new tasks they may encounter in the future. The ability to predict actions' consequences may facilitate such knowledge transfer. Here we consider domains where an RL agent has access to two kinds of information: agent-centric information, whose semantics are constant across tasks, and environment-centric information, which is necessary for solving the task but whose semantics differ between tasks. For example, in robot navigation, environment-centric information may include the robot's geographic location, while agent-centric information may include sensor readings of nearby obstacles. We propose that these situations provide an opportunity for a very natural style of knowledge transfer, in which the agent learns to predict actions' environmental consequences using agent-centric information. These predictions contain important information about the affordances and dangers present in a novel environment, and can effectively transfer knowledge from agent-centric to environment-centric learning systems. Using several example problems, including spatial navigation and network routing, we show that our knowledge transfer approach enables faster, lower-cost learning than existing alternatives.
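
The abstract does not spell out an implementation, but the core idea can be illustrated with a toy sketch. The Python code below is an illustrative assumption, not the authors' method: it assumes a gridworld navigation setting, a 3x3 local obstacle window as the agent-centric observation, (row, column) coordinates as the environment-centric state, and a simple logistic model that predicts whether a move will be blocked. All function names (local_window, blocked, collect, train_predictor) are hypothetical. The predictor is trained in one maze and then applied, unchanged, in a second maze where the coordinates mean something entirely different.

    import numpy as np

    ACTIONS = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right

    def local_window(grid, pos):
        """Agent-centric observation: flattened 3x3 window of obstacle flags around pos."""
        r, c = pos
        padded = np.pad(grid, 1, constant_values=1)  # treat out-of-bounds as walls
        return padded[r:r + 3, c:c + 3].ravel().astype(float)

    def blocked(grid, pos, a):
        """Environment-centric consequence: does action a from pos hit a wall or edge?"""
        r, c = pos[0] + ACTIONS[a][0], pos[1] + ACTIONS[a][1]
        return not (0 <= r < grid.shape[0] and 0 <= c < grid.shape[1]) or grid[r, c] == 1

    def collect(grid, n=2000, rng=None):
        """Random experience: (agent-centric window, action) -> was the move blocked?"""
        rng = rng or np.random.default_rng(0)
        X, y = [], []
        free = np.argwhere(grid == 0)
        for _ in range(n):
            pos = tuple(free[rng.integers(len(free))])
            a = rng.integers(4)
            X.append(np.concatenate([local_window(grid, pos), np.eye(4)[a]]))
            y.append(float(blocked(grid, pos, a)))
        return np.array(X), np.array(y)

    def train_predictor(X, y, epochs=500, lr=0.1):
        """Logistic-regression consequence predictor trained by gradient descent."""
        w = np.zeros(X.shape[1])
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-np.clip(X @ w, -30, 30)))
            w -= lr * X.T @ (p - y) / len(y)
        return w

    # Train on one maze. Because the predictor's inputs are agent-centric, it transfers.
    rng = np.random.default_rng(1)
    maze_a = (rng.random((8, 8)) < 0.25).astype(int)
    maze_a[0, 0] = 0
    w = train_predictor(*collect(maze_a, rng=rng))

    # In a new maze, the transferred predictor flags dangerous actions before any
    # environment-centric (coordinate-indexed) learning has happened.
    maze_b = (rng.random((8, 8)) < 0.25).astype(int)
    maze_b[4, 4] = 0
    x = np.concatenate([local_window(maze_b, (4, 4)), np.eye(4)[0]])
    p_blocked = 1.0 / (1.0 + np.exp(-x @ w))
    print(f"Predicted P(move up is blocked) at (4,4) in the new maze: {p_blocked:.2f}")

In a full RL agent, such transferred consequence predictions could seed a model or shape initial action values in the new environment, which is the kind of agent-centric to environment-centric transfer the abstract describes for its spatial navigation and network routing examples.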

Authors

  • Eric Chalmers
    Department of Electrical and Computer Engineering, University of Alberta, Edmonton, AB, Canada.
  • Edgar Bermudez Contreras
  • Brandon Robertson
  • Artur Luczak
  • Aaron Gruber
University of Lethbridge, Canadian Centre for Behavioural Neuroscience, 4401 University Dr. W., Lethbridge, AB, Canada T1K 3M4.