Comparative Analysis of Generative Pre-Trained Transformer Models in Oncogene-Driven Non-Small Cell Lung Cancer: Introducing the Generative Artificial Intelligence Performance Score.

Journal: JCO Clinical Cancer Informatics
PMID:

Abstract

PURPOSE: Precision oncology in non-small cell lung cancer (NSCLC) relies on biomarker testing for clinical decision making. Despite its importance, its practice is hindered by a lack of genomic oncology training, nonstandardized biomarker reporting, and a rapidly evolving treatment landscape. Generative artificial intelligence (AI), such as ChatGPT, offers promise for enhancing clinical decision support. Effective performance metrics are crucial for evaluating both the accuracy of these models and their propensity to produce incorrect or hallucinated information. We assessed the ability of several ChatGPT versions to generate accurate next-generation sequencing reports and treatment recommendations for NSCLC, using a novel Generative AI Performance Score (G-PS) that considers accuracy, relevancy, and hallucinations.

Authors

  • Zacharie Hamilton
    University of Illinois Chicago, Chicago, IL.
  • Aseem Aseem
    University of Illinois Chicago, Chicago, IL.
  • Zhengjia Chen
    University of Illinois Chicago, Chicago, IL.
  • Noor Naffakh
    University of Illinois Chicago, Chicago, IL.
  • Natalie M Reizine
    University of Illinois Chicago, Chicago, IL.
  • Frank Weinberg
    University of Illinois Chicago, Chicago, IL.
  • Shikha Jain
    University of Illinois Chicago, Chicago, IL.
  • Larry G Kessler
    University of Washington, Seattle, WA.
  • Vijayakrishna K Gadi
    University of Illinois Chicago, Chicago, IL.
  • Christopher Bun
    Kirkland & Ellis, Chicago, IL.
  • Ryan H Nguyen
    University of Illinois Chicago, Chicago, IL.