A future role for health applications of large language models depends on regulators enforcing safety standards.

Journal: The Lancet Digital Health
Published Date:

Abstract

Amid the rapid integration of artificial intelligence into clinical settings, large language models (LLMs), such as Generative Pre-trained Transformer-4, have emerged as multifaceted tools with potential for health-care delivery, diagnosis, and patient care. However, deployment of LLMs raises substantial regulatory and safety concerns. Due to their high output variability, poor inherent explainability, and the risk of so-called AI hallucinations, LLM-based health-care applications that serve a medical purpose face regulatory challenges for approval as medical devices under US and EU laws, including the recently passed EU Artificial Intelligence Act. Despite unaddressed risks for patients, including misdiagnosis and unverified medical advice, such applications are available on the market. The regulatory ambiguity surrounding these tools creates an urgent need for frameworks that accommodate their unique capabilities and limitations. Alongside the development of these frameworks, existing regulations should be enforced. If regulators are afraid to enforce the regulations in a market in which large technology companies dominate supply or development, the consequences of harm to laypeople will force belated action, damaging the potential of LLM-based applications for layperson medical advice.

Authors

  • Oscar Freyer
    Else Kröner Fresenius Center for Digital Health, TUD Dresden University of Technology, Dresden, Germany.
  • Isabella Catharina Wiest
    Else Kröner Fresenius Center for Digital Health, TUD Dresden University of Technology, Dresden, Germany; Department of Medicine, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany.
  • Jakob Nikolas Kather
    Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany.
  • Stephen Gilbert
    Ada Health GmbH, Berlin, Germany.