Towards more effective skill discovery in reinforcement learning by incorporating state reachability.
Journal:
Neural Networks: The Official Journal of the International Neural Network Society
Published Date:
Dec 11, 2025
Abstract
Skill discovery in reinforcement learning seeks to autonomously learn a diverse repertoire of behaviors, enabling efficient adaptation to downstream tasks. While existing methods primarily maximize mutual information between skills and states to ensure diversity, they often fail to guarantee state reachability, leading to skill-unreachable regions that hinder adaptation in complex environments. To address this, we propose Skill Discovery with State Reachability (SDSR), a novel framework that explicitly integrates reachability into skill learning. SDSR enhances traditional mutual information-based methods by introducing a skill-conditioned inverse dynamics model, which learns the necessary state-action transitions to expand the agent's accessible state space, and a meta-policy optimization mechanism, which jointly optimizes skill diversity and reachability to ensure comprehensive state coverage. We implement SDSR through two complementary approaches: threshold-based selection, which leverages high-probability state-action pairs from experience for efficient skill learning in low-dimensional environments, and joint training, which optimizes reachability alongside reinforcement learning objectives, making it well-suited for high-dimensional environments. Extensive experiments in both 2D and robotic environments demonstrate that SDSR significantly improves skill diversity, enhances exploration efficiency, and accelerates adaptation to downstream tasks. By expanding the agent's accessible state space while maintaining structured skill diversity, SDSR provides a robust and generalizable foundation for reinforcement learning in complex decision-making domains.
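To illustrate the threshold-based selection idea mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes that some skill-conditioned inverse dynamics model has already assigned a probability to each logged action, and it simply keeps the transitions whose probability meets a cutoff. The `Transition` type, the `select_high_probability` function, the `threshold` value of 0.8, and the probabilities in the example are all hypothetical and chosen only for illustration.

```python
from typing import List, NamedTuple, Sequence


class Transition(NamedTuple):
    """One logged step; the field layout is an assumption for this sketch."""
    state: tuple
    action: int
    next_state: tuple
    skill: int


def select_high_probability(
    transitions: Sequence[Transition],
    action_probs: Sequence[float],
    threshold: float = 0.8,
) -> List[Transition]:
    """Keep transitions whose logged action has probability >= threshold.

    action_probs[i] is assumed to be the probability that a (hypothetical)
    skill-conditioned inverse dynamics model assigns to
    transitions[i].action given (state, next_state, skill).
    """
    if len(transitions) != len(action_probs):
        raise ValueError("one probability per transition is required")
    return [t for t, p in zip(transitions, action_probs) if p >= threshold]


# Toy batch on a 2D grid; the probabilities are made up for illustration.
batch = [
    Transition((0, 0), 1, (0, 1), skill=0),
    Transition((0, 1), 2, (1, 1), skill=0),
    Transition((1, 1), 0, (1, 0), skill=1),
]
probs = [0.95, 0.40, 0.85]
selected = select_high_probability(batch, probs, threshold=0.8)
```

In this toy batch the second transition falls below the cutoff and is dropped, so only the first and third transitions would be passed on to skill learning. How the abstract's method actually computes these probabilities and sets the cutoff is not stated here.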