Slow Thinking for Sequential Recommendation
Journal: arXiv
Published Date: Apr 13, 2025
Abstract
To develop effective sequential recommender systems, numerous methods have
been proposed to model historical user behaviors. Despite their effectiveness,
these methods share the same fast thinking paradigm. That is, to make
recommendations, these methods typically encode users' historical interactions
to obtain user representations and directly match these representations with
candidate item representations. However, due to the limited capacity of
traditional lightweight recommendation models, this one-step inference paradigm
often leads to suboptimal performance. To tackle this issue, we present a novel
slow thinking recommendation model, named STREAM-Rec. Our approach is capable
of analyzing historical user behavior, generating a multi-step, deliberative
reasoning process, and ultimately delivering personalized recommendations. In
particular, we focus on two key challenges: (1) identifying suitable
reasoning patterns in recommender systems, and (2) exploring how to effectively
stimulate the reasoning capabilities of traditional recommenders. To this end,
we introduce a three-stage training framework. In the first stage, the model is
pretrained on large-scale user behavior data to learn behavior patterns and
capture long-range dependencies. In the second stage, we design an iterative
inference algorithm to annotate suitable reasoning traces by progressively
refining the model predictions. This annotated data is then used to fine-tune
the model. Finally, in the third stage, we apply reinforcement learning to
further enhance the model's generalization ability. Extensive experiments
validate the effectiveness of our proposed method.
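
For readers unfamiliar with the fast thinking paradigm the abstract contrasts against, the following is a minimal sketch of one-step inference: encode the interaction history into a single user vector, then score candidates by inner product. It is not STREAM-Rec and not the authors' code. The mean-pooling encoder, the function names (`encode_history`, `score`, `recommend`), and the toy embeddings in `ITEM_VECS` are all assumptions made for illustration; a real system would use a learned sequence encoder.

```python
from typing import Dict, List


def encode_history(history: List[str], item_vecs: Dict[str, List[float]]) -> List[float]:
    """Mean-pool item embeddings into one user vector.

    Assumption: mean pooling stands in for a learned sequence encoder.
    """
    if not history:
        raise ValueError("history must contain at least one item")
    dim = len(item_vecs[history[0]])
    user = [0.0] * dim
    for item in history:
        for k, value in enumerate(item_vecs[item]):
            user[k] += value
    return [x / len(history) for x in user]


def score(user: List[float], item: List[float]) -> float:
    """Inner product between the user vector and one candidate item vector."""
    return sum(u * i for u, i in zip(user, item))


def recommend(history: List[str], item_vecs: Dict[str, List[float]], top_k: int = 1) -> List[str]:
    """One-step inference: encode once, match once, return the top-k unseen items."""
    user = encode_history(history, item_vecs)
    candidates = [item for item in item_vecs if item not in history]
    ranked = sorted(candidates, key=lambda item: score(user, item_vecs[item]), reverse=True)
    return ranked[:top_k]


# Hypothetical toy embeddings, chosen only to make the example concrete.
ITEM_VECS = {
    "a": [1.0, 0.0],
    "b": [0.9, 0.1],
    "c": [0.0, 1.0],
    "d": [0.8, 0.2],
}

if __name__ == "__main__":
    print(recommend(["a", "b"], ITEM_VECS, top_k=2))
```

The point of the sketch is that the prediction comes from a single encode-and-match pass with no intermediate steps. The slow thinking approach described in the abstract instead generates a multi-step reasoning process before producing the final recommendation.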