Policy gradient optimization of controllers for natural dynamic mono-pedal gait.
Journal:
Bioinspiration & Biomimetics
Published Date:
Mar 25, 2020
Abstract
We have previously suggested a biologically inspired natural dynamic controller for biped locomotion, which applies torque pulses to the different joints at particular values of an internal phase variable. The parameters of the controller, including the timing and magnitude of the torque pulses and the dynamics of the phase variable, can be kept constant in open loop or adapted to the environment in closed loop. Here we demonstrate the application of this approach to a mono-ped robot and the optimization of the controller parameters via policy gradient to enhance robustness. Policy gradient was applied in simulation rather than on the actual robot because of safety and hardware considerations. A grounded action transformation (GAT) was learned and used to facilitate the transfer of the learned policy from simulation to hardware. We demonstrate how GAT improves the match between simulations and experiments and how learning enhances the performance and robustness of the mono-ped robot.
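The controller structure described in the abstract (torque pulses triggered at particular values of an internal phase variable) can be illustrated with a minimal open-loop sketch. Everything named here is an assumption made for illustration and is not taken from the paper: the `TorquePulse` fields, the constant-rate phase update in `advance_phase`, the normalization of the phase to [0, 1), and the example numbers. The paper's actual phase dynamics and parameterization may differ.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TorquePulse:
    """Hypothetical pulse description; field names are illustrative only."""
    joint: int        # index of the joint the pulse acts on
    start: float      # phase at which the pulse begins, in [0, 1)
    duration: float   # pulse width, in phase units
    magnitude: float  # torque applied while the pulse is active


def advance_phase(phase: float, omega: float, dt: float) -> float:
    """Advance the phase at a constant rate omega (cycles per second).

    Assumed phase dynamics: a simple clock that wraps to [0, 1).
    """
    return (phase + omega * dt) % 1.0


def joint_torques(phase: float, pulses: List[TorquePulse], n_joints: int) -> List[float]:
    """Sum the torques of all pulses that are active at the given phase."""
    torques = [0.0] * n_joints
    for p in pulses:
        # Phase elapsed since the pulse started, wrapped so that pulses
        # crossing the end of the cycle are handled correctly.
        elapsed = (phase - p.start) % 1.0
        if elapsed < p.duration:
            torques[p.joint] += p.magnitude
    return torques


# Example parameter set (made-up values): one pulse per joint.
pulses = [
    TorquePulse(joint=0, start=0.1, duration=0.2, magnitude=5.0),
    TorquePulse(joint=1, start=0.6, duration=0.1, magnitude=-3.0),
]

if __name__ == "__main__":
    phase = 0.0
    for _ in range(10):
        print(round(phase, 2), joint_torques(phase, pulses, n_joints=2))
        phase = advance_phase(phase, omega=1.0, dt=0.1)
```

In this sketch the open-loop case corresponds to holding the pulse parameters and `omega` fixed. A closed-loop variant would adjust them from sensor feedback, and a policy gradient method would treat them as the parameters to be optimized.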