Automatic Treatment Planning using Reinforcement Learning for High-dose-rate Prostate Brachytherapy
Journal:
arXiv
Published Date:
Jun 11, 2025
Abstract
Purpose: In high-dose-rate (HDR) prostate brachytherapy procedures, the
pattern of needle placement solely relies on physician experience. We
investigated the feasibility of using reinforcement learning (RL) to provide
needle positions and dwell times based on patient anatomy during pre-planning
stage. This approach would reduce procedure time and ensure consistent plan
quality. Materials and Methods: We train a RL agent to adjust the position of
one selected needle and all the dwell times on it to maximize a pre-defined
reward function after observing the environment. After adjusting, the RL agent
then moves on to the next needle, until all needles are adjusted. Multiple
rounds are played by the agent until the maximum number of rounds is reached.
Plan data from 11 prostate HDR boost patients (1 for training, and 10 for
testing) treated in our clinic were included in this study. The dosimetric
metrics and the number of used needles of RL plan were compared to those of the
clinical results (ground truth). Results: On average, RL plans and clinical
plans have very similar prostate coverage (Prostate V100) and Rectum D2cc (no
statistical significance), while RL plans have less prostate hotspot (Prostate
V150) and Urethra D20% plans with statistical significance. Moreover, RL plans
use 2 less needles than clinical plan on average. Conclusion: We present the
first study demonstrating the feasibility of using reinforcement learning to
autonomously generate clinically practical HDR prostate brachytherapy plans.
This RL-based method achieved equal or improved plan quality compared to
conventional clinical approaches while requiring fewer needles. With minimal
data requirements and strong generalizability, this approach has substantial
potential to standardize brachytherapy planning, reduce clinical variability,
and enhance patient outcomes.