Dynamic Inverse Reinforcement Learning for Feedback-driven Reward Estimation in Brain Machine Interface Tasks.
Journal:
Annual International Conference of the IEEE Engineering in Medicine and Biology Society (EMBC)
PMID:
40039912
Abstract
Reinforcement learning (RL)-based brain machine interfaces (BMIs) provide a promising solution for paralyzed people. Enhancing the decoding performance of RL-based BMIs relies on the design of effective reward signals. Inverse reinforcement learning (IRL) offers an approach to infer subjects' own evaluation from their observed behavior. However, applying IRL to extract reward information in complex BMI tasks requires accounting for the dynamics of subjects' goals during the control process. This dynamic nature of subjects' evaluation requires an IRL method that can estimate a time-varying reward function, whereas previous IRL methods applied in BMI systems estimated only a static reward function. Existing IRL algorithms for dynamic reward estimation use optimization methods to approximate the reward map for each state at each time step, which demands substantial amounts of data to converge. In this paper, we propose a dynamic IRL method to estimate the feedback-driven reward of subjects during BMI tasks. We use a state-observation model to continuously infer the reward value of each state, with sensory feedback serving as the external input that models the transition process of the reward. We evaluate the proposed method on a simulated BMI fetch task, a multi-step task with a time-varying reward function. Our method yields reward estimates close to the ground-truth values and significantly outperforms the existing dynamic IRL method when the map size exceeds 25 (p < 0.01). These preliminary results suggest that the dynamic IRL method for feedback-driven reward estimation holds potential for improving the design of RL-based BMIs.
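To make the state-observation idea concrete, here is a minimal, hypothetical sketch (not the authors' implementation): it treats the per-state reward as the latent state of a linear-Gaussian filter, uses sensory feedback as the input to the reward transition step, and corrects the prediction with a behavior-derived reward cue as the observation. The function name, matrices H and B, noise variances, and the toy feedback signal are all illustrative assumptions.

```python
# Hypothetical sketch: tracking a time-varying per-state reward with a
# linear-Gaussian state-observation model (Kalman-filter-style update).
import numpy as np

def kalman_reward_step(r_mean, r_cov, feedback, obs, H, B,
                       q_var=1e-2, obs_var=1e-1):
    """One predict/update cycle for the per-state reward estimate.

    r_mean, r_cov : current reward estimate (n_states,) and its covariance
    feedback      : external sensory-feedback vector (n_states,)
    obs           : behavior-derived reward cue for this step (n_obs,)
    H             : observation matrix mapping reward -> cue (n_obs, n_states)
    B             : input matrix mapping feedback -> reward drift
    """
    n = r_mean.size
    # Predict: the reward drifts with the feedback-driven input (assumed linear).
    r_pred = r_mean + B @ feedback
    P_pred = r_cov + q_var * np.eye(n)
    # Update: correct the prediction with the observed reward cue.
    S = H @ P_pred @ H.T + obs_var * np.eye(obs.size)
    K = P_pred @ H.T @ np.linalg.solve(S, np.eye(obs.size))
    r_new = r_pred + K @ (obs - H @ r_pred)
    P_new = (np.eye(n) - K @ H) @ P_pred
    return r_new, P_new

# Toy usage on a 5-state map: the true reward of state 3 rises over time.
rng = np.random.default_rng(0)
n_states = 5
r_mean, r_cov = np.zeros(n_states), np.eye(n_states)
H, B = np.eye(n_states), 0.5 * np.eye(n_states)
true_r = np.zeros(n_states)
for t in range(50):
    true_r[3] += 0.1                       # slowly shifting goal
    feedback = true_r - r_mean             # toy stand-in for sensory feedback
    obs = true_r + 0.3 * rng.standard_normal(n_states)
    r_mean, r_cov = kalman_reward_step(r_mean, r_cov, feedback, obs, H, B)
print(np.round(r_mean, 2))
```

In this toy setup the filter tracks the drifting reward of state 3 while the other states stay near zero; in an actual BMI task the feedback and observation terms would come from the recorded sensory signals and the subject's observed behavior rather than from the ground-truth reward.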