In-context learning enables multimodal large language models to classify cancer pathology images.

Journal: Nature Communications
Published Date:

Abstract

Medical image classification requires labeled, task-specific datasets, which are used to train deep learning networks de novo or to fine-tune foundation models. However, this process is computationally and technically demanding. In language processing, in-context learning provides an alternative: models learn from examples supplied within the prompt, bypassing the need for parameter updates. Yet, in-context learning remains underexplored in medical image analysis. Here, we systematically evaluate Generative Pretrained Transformer 4 with Vision capabilities (GPT-4V) with in-context learning on three cancer histopathology tasks of high clinical importance: classification of tissue subtypes in colorectal cancer, colon polyp subtyping, and breast tumor detection in lymph node sections. Our results show that in-context learning is sufficient to match or even outperform specialized neural networks trained for particular tasks, while requiring only a minimal number of samples. In summary, this study demonstrates that large vision-language models trained on non-domain-specific data can be applied out of the box to solve medical image-processing tasks in histopathology. This democratizes access to generalist AI models for medical experts without a technical background, especially in areas where annotated data are scarce.
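
The few-shot, in-context setup described in the abstract can, in principle, be reproduced against any vision-capable chat model with a short script: labeled reference tiles are placed directly in the prompt, followed by the query tile, and no model weights are updated. The sketch below is an illustration only, not the authors' published pipeline; the model name (`gpt-4o`), class labels, prompt wording, and file paths are assumptions chosen for clarity.

```python
# Minimal sketch of few-shot in-context prompting of a vision-language model
# for histology tile classification. Not the authors' published code: model
# name, prompt wording, file names, and class labels are illustrative only.
import base64
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment


def image_part(path: str) -> dict:
    """Encode a local image tile as a base64 data-URL message part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    return {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}}


# Hypothetical labeled example tiles acting as the in-context "training set".
few_shot_examples = [
    ("tumor_example.png", "tumor"),
    ("stroma_example.png", "stroma"),
    ("normal_mucosa_example.png", "normal mucosa"),
]

content = [{
    "type": "text",
    "text": ("You are a pathology assistant. Classify each histology tile as one of: "
             "tumor, stroma, normal mucosa. Here are labeled reference tiles."),
}]
for path, label in few_shot_examples:
    content.append(image_part(path))
    content.append({"type": "text", "text": f"Label: {label}"})

# The query tile: the examples above are the only "training" the model receives.
content.append({"type": "text", "text": "Now classify this tile. Answer with the label only."})
content.append(image_part("query_tile.png"))

response = client.chat.completions.create(
    model="gpt-4o",  # any vision-capable chat model; the study used GPT-4V
    messages=[{"role": "user", "content": content}],
    max_tokens=10,
)
print(response.choices[0].message.content)
```

The essential point the sketch conveys is that the labeled examples travel inside the prompt itself, so the approach needs no task-specific training or fine-tuning infrastructure.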

Authors

  • Dyke Ferber
    Department of Medical Oncology and Internal Medicine VI, National Center for Tumor Diseases, University Hospital Heidelberg, Heidelberg, Germany.
  • Georg Wölflein
    School of Computer Science, University of St Andrews, St Andrews, UK.
  • Isabella C Wiest
    Else Kroener Fresenius Center for Digital Health, Faculty of Medicine and University Hospital Carl Gustav Carus, TUD Dresden University of Technology, Dresden, Germany; Department of Medicine II, Medical Faculty Mannheim, Heidelberg University, Mannheim, Germany.
  • Marta Ligero
  • Srividhya Sainath
    Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany.
  • Narmin Ghaffari Laleh
    Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany.
  • Omar S M El Nahhas
    Else Kroener Fresenius Center for Digital Health, Technical University Dresden, Dresden, Germany.
  • Gustav Müller-Franzes
Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany.
  • Dirk Jäger
    Department of Medical Oncology and Internal Medicine VI, National Center for Tumor Diseases, University Hospital Heidelberg, Heidelberg, Germany.
  • Daniel Truhn
Department of Diagnostic and Interventional Radiology, University Hospital Aachen, Aachen, Germany.
  • Jakob Nikolas Kather
    Department of Medicine III, University Hospital RWTH Aachen, Aachen, Germany.