Exploration in neo-Hebbian reinforcement learning: Computational approaches to the exploration-exploitation balance with bio-inspired neural networks.
Journal:
Neural Networks: The Official Journal of the International Neural Network Society
Publication Date:
Jul 1, 2022
Abstract
Recent theoretical and experimental works have connected Hebbian plasticity with the reinforcement learning (RL) paradigm, producing a class of trial-and-error learning in artificial neural networks known as neo-Hebbian plasticity. Inspired by the role of the neuromodulator dopamine in synaptic modification, neo-Hebbian RL methods extend unsupervised Hebbian learning rules with value-based modulation to selectively reinforce associations. This reinforcement enables the learning of exploitative behaviors and yields RL models with strong biological plausibility. This review begins with coverage of fundamental concepts in rate- and spike-coded models. We introduce Hebbian correlation detection as a basis for the modification of synaptic weights and progress to neo-Hebbian RL models guided solely by extrinsic rewards. We then analyze state-of-the-art neo-Hebbian approaches to the exploration-exploitation balance under the RL paradigm, emphasizing works that employ additional mechanisms to modulate this balance. Our review of neo-Hebbian RL methods in this context indicates substantial potential for novel improvements in exploratory learning, primarily through stronger incorporation of intrinsic motivators. We provide several research suggestions for this pursuit, drawing from modern theories and results in neuroscience and psychology. The exploration-exploitation balance is a central issue in RL research, and this review is the first to focus on it under the neo-Hebbian RL framework.
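To make the value-modulated Hebbian idea concrete, the sketch below shows a minimal reward-gated (three-factor) Hebbian weight update in a linear rate-coded model: the classic pre/post correlation term is multiplied by a scalar reward signal, analogous to dopaminergic modulation. This is an illustrative sketch under our own assumptions, not a method from the paper; the function name, network sizes, and learning rate are all hypothetical.

```python
import numpy as np

# Minimal sketch of a reward-modulated (neo-Hebbian / three-factor) update.
# All names and constants here are illustrative assumptions, not the paper's.

rng = np.random.default_rng(0)

n_pre, n_post = 4, 3
W = rng.normal(scale=0.1, size=(n_post, n_pre))  # synaptic weights
eta = 0.01                                       # learning rate

def neo_hebbian_update(W, pre, post, reward, eta=0.01):
    """Hebbian correlation term (post x pre) gated by a scalar reward,
    so associations are reinforced only when the trial yields value."""
    hebbian = np.outer(post, pre)      # two-factor correlation term
    return W + eta * reward * hebbian  # third factor: value-based modulation

# One trial: present an input, compute activity, receive an extrinsic reward.
pre = rng.random(n_pre)    # presynaptic firing rates
post = W @ pre             # postsynaptic rates (linear rate model)
reward = 1.0               # scalar reward for this trial

W = neo_hebbian_update(W, pre, post, reward, eta)
```

With reward fixed at 1.0 this reduces to an unsupervised Hebbian rule; setting it to 0 (or a signed prediction error) suppresses or reverses the update, which is the selective reinforcement the abstract describes.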