Reinforcement learning-driven automated head and neck simultaneous integrated boost (SIB) radiation therapy: flexible treatment planning aligned with clinical preferences.

Journal: Physics in medicine and biology
PMID:

Abstract

Head-and-neck simultaneous integrated boost (SIB) treatment planning using intensity-modulated radiation therapy is particularly challenging because the targets lie close to organs-at-risk. Depending on the specific clinical conditions, different parotid-sparing strategies are used to preserve parotid function without compromising local tumor control. Clinically, the strategy is typically chosen by the attending physician's directive or by trial-and-error comparison of plans with different sparing tradeoffs. To streamline this process, we propose a deep reinforcement learning (DRL)-based framework that automatically generates treatment plans and can adapt to clinical preferences.

A preference-encoded DRL (PEDRL) framework was developed to self-interact with the clinical treatment planning system and dynamically adjust objective constraints in the inverse optimization space. It was powered by the discrete soft actor-critic algorithm with a multi-layer perceptron architecture. The agent interprets the intermediate plan status and iteratively modifies objective constraint values in a human-like fashion. Because parotid-sparing preferences are encoded within the state space, the agent autonomously adapts the sparing strategy to achieve optimal plan quality based on clinical priorities. The agent was trained through iterative treatment plan generation using 40 cases and subsequently tested on an additional 44 patients, with the generated plans compared to the clinical plans.

The PEDRL-generated plans demonstrated comparable performance across all dosimetric evaluation metrics for both bilateral and unilateral sparing cases in the test set. For bilateral cases, the mean parotid median dose was 18.82 Gy (left) and 19.61 Gy (right), compared to 19.31 Gy (left) and 19.12 Gy (right) in the clinical plans. In unilateral sparing cases, the mean median dose to the spared parotid was 19.92 Gy in the PEDRL-generated plans, compared to 17.16 Gy in the clinical plans.

The proposed automated treatment planning framework efficiently generates SIB treatment plans tailored to clinical preferences, demonstrating both effectiveness and adaptability.
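
The idea of encoding the sparing preference in the state and letting a discrete action adjust one objective constraint at a time could be sketched roughly as follows. This is a minimal illustration only, not the authors' implementation: the preference labels, metric names, action set, and step factors are all assumptions, and the treatment planning system and the soft actor-critic policy are omitted entirely.

```python
from typing import Dict, List, Tuple

# Hypothetical preference labels; the abstract only distinguishes
# bilateral from unilateral parotid sparing.
PREFERENCES: Tuple[str, ...] = ("bilateral", "spare_left", "spare_right")

# Hypothetical discrete action set: (constraint name, multiplicative step).
ACTIONS: List[Tuple[str, float]] = [
    ("parotid_left_median", 0.9),   # tighten left parotid constraint
    ("parotid_left_median", 1.1),   # relax left parotid constraint
    ("parotid_right_median", 0.9),  # tighten right parotid constraint
    ("parotid_right_median", 1.1),  # relax right parotid constraint
]


def encode_state(plan_metrics: Dict[str, float], preference: str) -> List[float]:
    """Concatenate plan metrics (ordered by name) with a one-hot preference code."""
    if preference not in PREFERENCES:
        raise ValueError(f"unknown preference: {preference}")
    one_hot = [1.0 if preference == p else 0.0 for p in PREFERENCES]
    return [plan_metrics[name] for name in sorted(plan_metrics)] + one_hot


def apply_action(constraints: Dict[str, float], action_index: int) -> Dict[str, float]:
    """Return a copy of the constraints with one value scaled by the chosen action."""
    name, factor = ACTIONS[action_index]
    updated = dict(constraints)
    updated[name] = constraints[name] * factor
    return updated


if __name__ == "__main__":
    metrics = {"parotid_left_median": 19.0, "parotid_right_median": 20.0}
    print(encode_state(metrics, "spare_left"))
    print(apply_action(metrics, 0))
```

In a full agent, the vector returned by `encode_state` would be the input to the policy network, and the selected action index would be passed to something like `apply_action` before the planning system re-optimizes the plan.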

Authors

  • Dongrong Yang
    Department of Urology, The Second Affiliated Hospital of Soochow University, Suzhou, China.
  • Xin Wu
    Key Laboratory of Engineering Biology for Low-Carbon Manufacturing, Tianjin Institute of Industrial Biotechnology, Chinese Academy of Sciences, National Center of Technology Innovation for Synthetic Biology, No. 32, Xiqi Road, Tianjin Airport Economic Park, Tianjin 300308, China. Electronic address: wuxin@tib.cas.cn.
  • Xinyi Li
    Department of Radiation Oncology, Duke University Medical Center, Durham, NC, United States.
  • Yibo Xie
    Information Center, Beijing Hospital, National Center of Gerontology, National Health Commission, Institute of Geriatric Medicine, Chinese Academy of Medical Sciences, Beijing, China.
  • Qiuwen Wu
    Department of Radiation Oncology, Duke University Medical Center, Durham, NC, United States.
  • Q Jackie Wu
    Department of Radiation Oncology, Duke University Medical Center, Durham, NC, United States.
  • Yang Sheng
    Department of Radiation Oncology, Duke University Medical Center, Durham, NC, United States.