Towards Superhuman Imitation Learning for Sequential Head-and-Neck Cancer Treatment Decisions

Journal: medRxiv

Published Date: Jan 1, 2025

Abstract

We propose a simulator-driven imitation learning framework for sequential decision making in head and neck cancer (HNC) treatment. Our method, Superhuman Policy Gradient Optimization (SPGO), integrates inverse reinforcement learning principles with policy gradient updates to derive three-stage treatment policies directly from recorded physician decisions. It leverages a pre-trained clinical simulator, which combines a variational autoencoder with gradient boosting models, to generate complete, temporally consistent patient trajectories, enabling safe and reproducible training. Unlike conventional behavior cloning, SPGO optimizes a subdominance loss that explicitly rewards surpassing the expert across multiple clinical outcomes, including relapse at year three and patient-reported toxicities at multiple follow-up times. We systematically compare six subdominance configurations (absolute vs. relative differences, sum vs. max aggregation, per-feature vs. max-only α updates) to assess how loss design affects convergence and treatment quality. Our best configuration (relative differences with sum aggregation and per-feature α updates) achieves over 70% superhuman dominance across clinically relevant features on held-out patients. The learned policies reproduce expert decisions on acute measures while significantly reducing predicted late toxicities and relapse risk, demonstrating generalization beyond the training distribution.

Categories

Applied computing → Health informatics; Computing methodologies → Reinforcement learning; Learning from demonstrations.
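
The abstract does not give the exact form of the subdominance loss, so the following is only a minimal sketch of how the absolute/relative and sum/max choices might fit together. It assumes a hinge-style form, max(α_k · diff_k + margin, 0), over outcome features where lower is better (for example, relapse probability or a toxicity score). The function names, the `margin` parameter, and the `eps` guard are hypothetical, and the α update rules compared in the paper are not modeled here.

```python
def subdominance(learner, expert, alpha, margin=1.0,
                 relative=True, aggregate="sum", eps=1e-8):
    """Hinge-style subdominance of a learner outcome vector w.r.t. an expert.

    Assumption (not from the source): each feature is a cost, so the
    learner does better when its value is lower than the expert's.
    """
    hinges = []
    for f_learner, f_expert, a in zip(learner, expert, alpha):
        diff = f_learner - f_expert              # absolute difference
        if relative:
            diff /= abs(f_expert) + eps          # scale by expert magnitude
        # Zero once the learner beats the expert by at least margin / a.
        hinges.append(max(a * diff + margin, 0.0))
    if aggregate == "sum":
        return sum(hinges)
    if aggregate == "max":
        return max(hinges)
    raise ValueError("aggregate must be 'sum' or 'max'")


def dominance_fraction(learner, expert):
    """Fraction of features on which the learner is at least as good."""
    wins = sum(1 for f_l, f_e in zip(learner, expert) if f_l <= f_e)
    return wins / len(expert)


if __name__ == "__main__":
    # Hypothetical predicted outcomes for one patient (lower is better).
    learner = [0.1, 0.5]
    expert = [0.2, 0.4]
    alpha = [2.0, 2.0]
    print(subdominance(learner, expert, alpha, relative=False))
    print(subdominance(learner, expert, alpha, relative=True, aggregate="max"))
    print(dominance_fraction(learner, expert))
```

Under these assumptions, sum aggregation penalizes every feature on which the learner fails to beat the expert by the margin, while max aggregation penalizes only the worst such feature. The relative variant makes features with different scales comparable before aggregation.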

Authors

F. Corna; X. Zhang; G. Canahuate; S. Attia; A. Mohamed; M. Naser; C. Fuller; G.E. Marai

