A Reinforcement-Learning-Enhanced LLM Framework for Automated A/B Testing in Personalized Marketing
Journal:
arXiv
Published Date:
May 27, 2025
Abstract
In personalized marketing, designing A/B testing algorithms that effectively
maximize user response remains a pressing challenge. In this paper, we present
RL-LLM-ABTest, a framework that combines reinforcement learning policy
optimization with a large language model (LLM) to automate and personalize A/B
tests. RL-LLM-ABTest is built upon a pre-trained instruction-tuned language
model. It first generates A/B versions of candidate content variants using a
Prompt-Conditioned Generator, and then uses a multi-modal perception module to
dynamically embed and fuse the user portrait with the context of the current
query, which together constitute the current interaction state.
The content version is then selected in real time by a policy optimization
module with an Actor-Critic structure, and long-term revenue is estimated from
real-time feedback (such as click-through rate and conversion rate).
Furthermore, a Memory-Augmented Reward Estimator is embedded into the framework
to capture long-term drift in user preferences, which helps the policy
generalize across users and content contexts. Numerical results on real-world
marketing data show that RL-LLM-ABTest outperforms existing A/B testing
methods, including classical A/B testing, contextual bandits, and benchmark
reinforcement learning approaches.
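
To make the selection step concrete, the following is a minimal sketch of how an actor-critic policy could choose between two content variants from click feedback. It is an illustrative toy under stated assumptions, not the framework's implementation: the class `ActorCriticSelector`, the function `simulate`, the one-hot user-segment state, the click probabilities, and the learning rates are all hypothetical. The LLM generator, the multi-modal perception module, and the Memory-Augmented Reward Estimator are omitted, and each interaction is treated as a single step, so the critic acts only as a baseline for the immediate reward.

```python
import math
import random


class ActorCriticSelector:
    """Toy one-step actor-critic that picks a content variant for a state."""

    def __init__(self, n_variants, n_features, actor_lr=0.1, critic_lr=0.1, seed=0):
        self.rng = random.Random(seed)
        # Actor: one weight vector per variant (linear softmax policy).
        self.theta = [[0.0] * n_features for _ in range(n_variants)]
        # Critic: linear estimate of the expected reward for a state.
        self.w = [0.0] * n_features
        self.actor_lr = actor_lr
        self.critic_lr = critic_lr

    def policy(self, state):
        """Return selection probabilities over variants (softmax of linear logits)."""
        logits = [sum(t * s for t, s in zip(row, state)) for row in self.theta]
        top = max(logits)  # subtract the max for numerical stability
        exps = [math.exp(l - top) for l in logits]
        total = sum(exps)
        return [e / total for e in exps]

    def select(self, state):
        """Sample a variant index from the current policy."""
        probs = self.policy(state)
        return self.rng.choices(range(len(probs)), weights=probs)[0]

    def value(self, state):
        return sum(w * s for w, s in zip(self.w, state))

    def update(self, state, action, reward):
        """Policy-gradient step using the critic's value as a baseline."""
        advantage = reward - self.value(state)
        probs = self.policy(state)
        for a, row in enumerate(self.theta):
            # Gradient of log softmax w.r.t. logit a: 1[a == action] - probs[a].
            coeff = (1.0 if a == action else 0.0) - probs[a]
            for i, s in enumerate(state):
                row[i] += self.actor_lr * advantage * coeff * s
        for i, s in enumerate(state):
            self.w[i] += self.critic_lr * advantage * s


def simulate(steps=5000, seed=0):
    """Run the selector against a synthetic click model (assumed, not real data)."""
    rng = random.Random(seed)
    # Hypothetical click probabilities indexed by [user segment][variant].
    click_prob = [[0.8, 0.2], [0.2, 0.8]]
    agent = ActorCriticSelector(n_variants=2, n_features=2, seed=seed + 1)
    for _ in range(steps):
        segment = rng.randrange(2)
        state = [1.0 if i == segment else 0.0 for i in range(2)]
        variant = agent.select(state)
        reward = 1.0 if rng.random() < click_prob[segment][variant] else 0.0
        agent.update(state, variant, reward)
    return agent


if __name__ == "__main__":
    trained = simulate()
    print("segment 0 policy:", trained.policy([1.0, 0.0]))
    print("segment 1 policy:", trained.policy([0.0, 1.0]))
```

In this sketch the synthetic click model rewards variant 0 for one user segment and variant 1 for the other, so the learned policy is expected to shift probability toward the better variant for each segment. A fuller treatment along the lines the abstract describes would replace the one-hot state with fused user and context embeddings and the immediate click reward with a long-term revenue estimate.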