Constructing Non-Markovian Decision Process via History Aggregator
Journal:
arXiv
Published Date:
Jun 30, 2025
Abstract
In the domain of algorithmic decision-making, non-Markovian dynamics manifest
as a significant impediment, especially for paradigms such as Reinforcement
Learning (RL), thereby exerting far-reaching consequences on the advancement
and effectiveness of the associated systems. Nevertheless, the existing
benchmarks are deficient in comprehensively assessing the capacity of decision
algorithms to handle non-Markovian dynamics. To address this deficiency, we
have devised a generalized methodology grounded in category theory. Notably, we
established the category of Markov Decision Processes (MDP) and the category of
non-Markovian Decision Processes (NMDP), and proved the equivalence
relationship between them. This theoretical foundation provides a novel
perspective for understanding and addressing non-Markovian dynamics. We further
introduced non-Markovianity into decision-making problem settings via the
History Aggregator for State (HAS). With HAS, we can precisely control the
state dependency structure of decision-making problems in the time series. Our
analysis demonstrates the effectiveness of our method in representing a broad
range of non-Markovian dynamics. This approach facilitates a more rigorous and
flexible evaluation of decision algorithms by testing them in problem settings
where non-Markovian dynamics are explicitly constructed.