Dopamine drives a positive reward bias on human reinforcement learning
Journal: bioRxiv
Published Date: Jan 1, 2025
Abstract
Formal theories of reinforcement learning (RL) prescribe a clearly defined function for dopamine, namely modulating learning via reward prediction errors (RPEs). Yet empirical evidence in humans remains scarce, and recent advances introducing noisy RL cast doubt on a simple one-to-one mapping between neurotransmitters and computational mechanisms. Here, we report a double-blind, placebo-controlled, randomised pharmacological study in which healthy volunteers received the dopamine precursor L-DOPA and performed a volatile two-armed bandit task. Behaviourally, L-DOPA decreased switching behaviour following below-average rewards. Algorithmic RL modelling of human behaviour supported a dual effect of L-DOPA on the rate and precision of learning. By leveraging recurrent neural networks (RNNs) as implementational models of RL, we explain this dual effect through a single inference-time modulation, whereby L-DOPA triggers a positive reward bias at the input of the recurrent layer that implements RL. Our findings highlight a unifying mechanism at the implementation level that explains seemingly disparate algorithmic effects of dopamine.
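To make the notions of an RPE-driven update and a positive reward bias concrete, the following is a minimal sketch of a textbook delta-rule learner. It is not the model fitted in the study, nor the RNN described above: the function names, the additive form of `reward_bias`, and all parameter values are assumptions chosen purely for illustration.

```python
def rpe_update(value, reward, learning_rate, reward_bias=0.0):
    """One delta-rule step toward the (optionally biased) reward.

    `reward_bias` is a hypothetical additive shift applied to the reward
    before learning; it is an illustrative assumption, not the study's model.
    """
    effective_reward = reward + reward_bias
    # Reward prediction error: received outcome minus current expectation.
    rpe = effective_reward - value
    return value + learning_rate * rpe, rpe


def run_trials(rewards, learning_rate=0.5, reward_bias=0.0, initial_value=0.5):
    """Track the value estimate of a single option across a reward sequence."""
    value = initial_value
    history = []
    for reward in rewards:
        value, _ = rpe_update(value, reward, learning_rate, reward_bias)
        history.append(value)
    return history


# Three below-average rewards (0.2) for an option initially valued at 0.5.
rewards = [0.2, 0.2, 0.2]
unbiased = run_trials(rewards)
biased = run_trials(rewards, reward_bias=0.1)
```

In this toy setting the biased learner devalues the option more slowly after below-average rewards, so a value-based chooser would have less reason to switch away from it. This only loosely parallels the reduced switching reported above and says nothing about how the bias is implemented in the recurrent network.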