Sample-Efficient Reinforcement Learning Controller for Deep Brain Stimulation in Parkinson's Disease
Journal:
arXiv
Published Date:
Jul 8, 2025
Abstract
Deep brain stimulation (DBS) is an established intervention for Parkinson's
disease (PD), but conventional open-loop systems lack adaptability, are
energy-inefficient due to continuous stimulation, and provide limited
personalization to individual neural dynamics. Adaptive DBS (aDBS) offers a
closed-loop alternative, using biomarkers such as beta-band oscillations to
dynamically modulate stimulation. While reinforcement learning (RL) holds
promise for personalized aDBS control, existing methods suffer from high sample
complexity, unstable exploration in binary action spaces, and limited
deployability on resource-constrained hardware.
We propose SEA-DBS, a sample-efficient actor-critic framework that addresses
the core challenges of RL-based adaptive neurostimulation. SEA-DBS integrates a
predictive reward model to reduce reliance on real-time feedback and employs
Gumbel Softmax-based exploration for stable, differentiable policy updates in
binary action spaces. Together, these components improve sample efficiency,
exploration robustness, and compatibility with resource-constrained
neuromodulatory hardware. We evaluate SEA-DBS on a biologically realistic
simulation of Parkinsonian basal ganglia activity, demonstrating faster
convergence, stronger suppression of pathological beta-band power, and
resilience to post-training FP16 quantization. Our results show that SEA-DBS
offers a practical and effective RL-based aDBS framework for real-time,
resource-constrained neuromodulation.