Developing artificial intelligence tools for institutional review board pre-review: A pilot study on ChatGPT's accuracy and reproducibility.

Journal: PLOS Digital Health

Abstract

This pilot study is the first phase of a broader project aimed at developing an explainable artificial intelligence (AI) tool to support the ethical evaluation of Japanese-language clinical research documents. The tool is explicitly not intended to assist document drafting. We assessed the baseline performance of two generative AI models, Generative Pre-trained Transformer (GPT)-4 and GPT-4o, in analyzing clinical research protocols and informed consent forms (ICFs). The goal was to determine whether these models could accurately and consistently extract ethically relevant information, including the research objectives and background, research design, and participant-related risks and benefits. First, we compared the performance of GPT-4 and GPT-4o using custom agents developed via OpenAI's Custom GPT functionality (hereafter "GPTs"). Then, using GPT-4o alone, we compared outputs generated by GPTs optimized with customized Japanese prompts to those generated by standard prompts. GPT-4o achieved 80% agreement in extracting research objectives and background and 100% in extracting research design, while both models demonstrated high reproducibility across ten trials. GPTs with customized prompts produced more accurate and consistent outputs than standard prompts. This study suggests the potential utility of generative AI in institutional review board (IRB) pre-review tasks; it also provides foundational data for future validation and standardization efforts involving retrieval-augmented generation and fine-tuning. Importantly, this tool is intended not to automate ethical review but rather to support IRB decision-making. Limitations include the absence of gold standard reference data, reliance on a single evaluator, lack of convergence and inter-rater reliability analysis, and the inability of AI to substitute for in-person elements such as site visits.
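The abstract reports agreement percentages and reproducibility across ten trials without defining how either was calculated. As a rough illustration only, the Python sketch below shows one plausible way such figures could be computed. It assumes that agreement is the share of trials a single evaluator judged consistent with the source document, and that reproducibility is the share of trials matching the most frequent output. The function names and the sample data are hypothetical and are not taken from the study.

```python
from collections import Counter


def agreement_rate(judgments):
    """Share of trials the evaluator marked as agreeing with the source document.

    Assumed definition; the abstract does not state how agreement was scored.
    """
    if not judgments:
        raise ValueError("at least one trial is required")
    return sum(judgments) / len(judgments)


def reproducibility(outputs):
    """Share of trials whose normalized output equals the most common output.

    Assumed definition; whitespace and letter case are ignored when comparing.
    """
    if not outputs:
        raise ValueError("at least one trial is required")
    normalized = [" ".join(text.split()).lower() for text in outputs]
    _, modal_count = Counter(normalized).most_common(1)[0]
    return modal_count / len(normalized)


# Hypothetical data for ten trials of one extraction item (not study data).
judgments = [True] * 8 + [False] * 2
outputs = ["randomized controlled trial"] * 9 + ["Randomized  controlled trial"]

print(agreement_rate(judgments))  # 0.8
print(reproducibility(outputs))   # 1.0
```

Exact string matching after normalization is a deliberately strict stand-in; free-text extractions would more likely need a human or semantic comparison, which is consistent with the single-evaluator limitation noted in the abstract.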

Authors

  • Yasuko Fukataki
    Clinical Research and Trial Center, Juntendo University Hospital, Tokyo 113-8421, Japan.
  • Wakako Hayashi
    Clinical Research and Trial Center, Juntendo University Hospital, Tokyo 113-8421, Japan.
  • Naoki Nishimoto
  • Yoichi M Ito
    Health Data Science, Department of Social Science, Graduate School of Medicine, Hokkaido University, Sapporo, Japan.
