Reinforcement learning-driven automated head and neck simultaneous integrated boost (SIB) radiation therapy: flexible treatment planning aligned with clinical preferences.
Journal:
Physics in medicine and biology
PMID:
40209749
Abstract
Head-and-neck simultaneous integrated boost (SIB) treatment planning using intensity-modulated radiation therapy is particularly challenging because the targets lie close to organs-at-risk. Depending on the specific clinical conditions, different parotid-sparing strategies are used to preserve parotid function without compromising local tumor control. Clinically, the strategy is typically chosen at the attending physician's directive or through trial-and-error comparison of plans with different sparing tradeoffs. To streamline this process, we proposed a deep reinforcement learning (DRL)-based framework that automatically generates treatment plans with the flexibility to adapt to clinical preferences.
A preference-encoded DRL (PEDRL) framework was developed to self-interact with the clinical treatment planning system and dynamically adjust objective constraints in the inverse optimization space. It was powered by the discrete soft actor-critic algorithm with a multi-layer perceptron architecture. The agent interprets intermediate plan status and iteratively modifies objective constraint values in a human-like fashion. By encoding parotid-sparing preferences within the state space, the agent autonomously adapts the sparing strategy to achieve optimal plan quality based on clinical priorities. The agent was trained through iterative treatment plan generation using 40 cases and subsequently tested on an additional 44 patients, with the generated plans compared to clinical plans.
The PEDRL-generated plans demonstrated comparable performance across all dosimetric evaluation metrics for both bilateral and unilateral sparing cases in the test set. For bilateral cases, the mean value of the parotid median dose was 18.82 Gy (left) and 19.61 Gy (right), compared to 19.31 Gy (left) and 19.12 Gy (right) in the clinical plans. In unilateral sparing cases, the mean value of the spared parotid median dose was 19.92 Gy in the PEDRL-generated plans, compared to 17.16 Gy in the clinical plans.
The proposed novel automated treatment planning framework efficiently generates SIB treatment plans tailored to clinical preferences, demonstrating both effectiveness and adaptability.
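
The abstract does not give the state encoding or the action set, so the following is only a minimal illustrative sketch of the two ideas it names: a sparing preference encoded into the state, and discrete actions that adjust objective constraint values. Every name here (`PREFERENCES`, `encode_state`, `build_actions`, `apply_action`), the one-hot encoding, the three preference codes, and the 5% step size are assumptions made for illustration, not details from the paper. The policy network and the treatment planning system interface are omitted.

```python
from typing import Dict, List, Tuple

# Hypothetical preference codes; the abstract only distinguishes
# bilateral from unilateral sparing and does not specify an encoding.
PREFERENCES = ("bilateral", "spare_left", "spare_right")


def encode_state(plan_metrics: List[float], preference: str) -> List[float]:
    """Append a one-hot preference code to the intermediate plan metrics."""
    if preference not in PREFERENCES:
        raise ValueError(f"unknown preference: {preference}")
    one_hot = [1.0 if preference == p else 0.0 for p in PREFERENCES]
    return list(plan_metrics) + one_hot


def build_actions(objectives: List[str], step: float = 0.05) -> List[Tuple[str, float]]:
    """Enumerate a discrete action set: a no-op, plus tightening or
    relaxing each objective's constraint value by a fixed fraction."""
    actions: List[Tuple[str, float]] = [("none", 1.0)]
    for name in objectives:
        actions.append((name, 1.0 - step))  # tighten the constraint value
        actions.append((name, 1.0 + step))  # relax the constraint value
    return actions


def apply_action(constraints: Dict[str, float], action: Tuple[str, float]) -> Dict[str, float]:
    """Return a copy of the constraints with the chosen one rescaled."""
    name, factor = action
    updated = dict(constraints)
    if name in updated:
        updated[name] = updated[name] * factor
    return updated


# Example: two plan metrics (Gy) with a hypothetical left-sparing preference.
state = encode_state([18.8, 19.6], "spare_left")
actions = build_actions(["parotid_left", "parotid_right"])
new_constraints = apply_action({"parotid_left": 20.0, "parotid_right": 20.0}, actions[1])
```

In a setup like this, a discrete soft actor-critic policy would map `state` to a probability distribution over the indices of `actions`, and the updated constraints would be passed back to the optimizer to produce the next intermediate plan.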