Automating the optimization of proton PBS treatment planning for head and neck cancers using policy gradient-based deep reinforcement learning.

Journal: Medical physics
PMID:

Abstract

BACKGROUND: Proton pencil beam scanning (PBS) treatment planning for head and neck (H&N) cancers is a time-consuming, experience-demanding task that involves a large number of potentially conflicting planning objectives. Deep reinforcement learning (DRL) has recently been introduced to the planning processes of intensity-modulated radiation therapy (IMRT) and brachytherapy for prostate, lung, and cervical cancers. However, existing DRL planning models are built upon the Q-learning framework and rely on weighted linear combinations of clinical metrics for reward calculation. These approaches suffer from poor scalability and flexibility: they can adjust only a limited number of planning objectives in discrete action spaces and therefore fail to generalize to more complex planning problems.

Authors

  • Qingqing Wang
  • Chang Chang
    Department of Radiation Medicine and Applied Sciences, University of California at San Diego, La Jolla, California, USA.