Multi-CALF: A Policy Combination Approach with Statistical Guarantees
Journal: arXiv
Published Date: May 18, 2025
Abstract
We introduce Multi-CALF, an algorithm that combines
reinforcement learning policies according to their relative value improvements. Our
approach integrates a standard RL policy with a theoretically backed
alternative policy, inheriting formal stability guarantees while often
achieving better performance than either policy individually. We prove that our
combined policy converges to a specified goal set with known probability and
provide precise bounds on maximum deviation and convergence time. Empirical
validation on control tasks demonstrates enhanced performance while maintaining
stability guarantees.
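
To make the idea of combining two policies by relative value improvement concrete, here is a minimal sketch. It is not the algorithm from the paper: the abstract does not specify the combination rule, so the logistic mixing rule, the one-step lookahead through a known `step` model, and every name here (`mixing_probability`, `combined_action`, `value_rl`, `value_alt`) are illustrative assumptions.

```python
import math
import random


def mixing_probability(delta_rl, delta_alt, temperature=1.0):
    """Probability of choosing the RL policy's action.

    Assumption: a logistic function of the difference between the two
    value improvements. The paper's actual rule may differ.
    """
    z = (delta_rl - delta_alt) / temperature
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def combined_action(state, rl_policy, alt_policy, value_rl, value_alt,
                    step, rng=random):
    """Pick one of two candidate actions based on relative value improvement.

    `step(state, action)` is a hypothetical one-step model used only to
    estimate how much each candidate action improves its own value estimate.
    """
    a_rl = rl_policy(state)
    a_alt = alt_policy(state)
    delta_rl = value_rl(step(state, a_rl)) - value_rl(state)
    delta_alt = value_alt(step(state, a_alt)) - value_alt(state)
    p_rl = mixing_probability(delta_rl, delta_alt)
    return a_rl if rng.random() < p_rl else a_alt


# Toy 1-D example: the goal is the origin and value is negative distance.
step = lambda s, a: s + a
value = lambda s: -abs(s)
rl_policy = lambda s: -0.5 * s    # aggressive stand-in for a learned policy
alt_policy = lambda s: -0.1 * s   # cautious stand-in for the backed policy

action = combined_action(4.0, rl_policy, alt_policy, value, value, step,
                         rng=random.Random(0))
```

In this toy setting the action whose successor state shows the larger value gain is chosen more often, while the alternative policy keeps a nonzero chance of being selected. A rule of this kind carries no formal guarantee by itself; the guarantees stated in the abstract depend on the paper's own construction.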