Using Large Language Models for Advanced and Flexible Labelling of Protocol Deviations in Clinical Development.

Journal: Therapeutic innovation & regulatory science
Published Date:

Abstract

BACKGROUND: ICH E3 Q&A R1 defines a protocol deviation (PD) as "any change, divergence, or departure from the study design or procedures defined in the protocol" [International Council for Harmonisation. E3: Structure and content of clinical study reports-questions and answers (R1). 6 July 2012. Available from: https://database.ich.org/sites/default/files/E3_Q%26As_R1_Q%26As.pdf ]. A persistent problem in human subject protection is the wide divergence among institutions, sponsors, investigators and IRBs in how PDs are defined and in the procedures for reviewing them. Despite industry initiatives such as TransCelerate's holistic approach [Galuchie et al. in Ther Innov Regul Sci 55:733-742, 2021], systematic trending and identification of impactful PDs remain limited. Traditional Natural Language Processing (NLP) methods are often cumbersome to implement, requiring extensive feature engineering and model tuning. The rise of Large Language Models (LLMs), however, has revolutionised text classification, enabling more accurate, nuanced, and context-aware solutions [Nguyen P. Text classification in the age of LLMs. 2024. Available from: https://blog.redsift.com/author/phong/ ]. An automated solution that enables efficient, flexible, and targeted PD classification is currently lacking.

Authors

  • Min Zou
    School of Public Health (Shenzhen), Shenzhen Campus of Sun Yat-sen University, Shenzhen 518107, P.R. China.
  • Leszek Popko
    F. Hoffmann-La Roche, Basel, Switzerland.
  • Michelle Gaudio
    Hoffmann-La Roche Limited, 7070 Mississauga Road, Mississauga, ON, L5N 5M8, Canada.