HunCRC: annotated pathological slides to enhance deep learning applications in colorectal cancer screening.

Journal: Scientific Data
Published Date:

Abstract

Histopathology is the gold standard method for staging and grading human tumors and provides critical information for the oncoteam's decision making. Highly trained pathologists are needed for careful microscopic analysis of the slides produced from biopsy tissue, which is a time-consuming process. A reliable decision support system would assist healthcare systems that often suffer from a shortage of pathologists. Recent advances in digital pathology allow for high-resolution digitization of pathological slides. Digital slide scanners combined with modern computer vision models, such as convolutional neural networks, can help pathologists in their everyday work, resulting in shortened diagnosis times. This study publishes 200 digital whole-slide images of hematoxylin-eosin stained colorectal biopsies. Alongside the whole-slide images, detailed region-level annotations are provided for ten relevant pathological classes. After pre-processing, the 200 digital slides yielded 101,389 patches. A single patch is a 512 × 512 pixel image covering a 248 × 248 μm tissue area. Higher-resolution versions are available as well. HunCRC, a widely accessible dataset, will hopefully aid future colorectal cancer research and computer-aided diagnosis.
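
The patch figures quoted in the abstract imply a pixel size and an average patch count per slide. The short Python sketch below works through that arithmetic. The grid-tiling helper is a hypothetical illustration: it assumes non-overlapping patches laid out on a regular grid, which the abstract does not state, and it is not the authors' pre-processing pipeline.

```python
# Figures taken from the abstract.
PATCH_PIXELS = 512       # patch side length in pixels
PATCH_MICRONS = 248.0    # patch side length in micrometres
N_SLIDES = 200
N_PATCHES = 101_389


def microns_per_pixel(patch_pixels=PATCH_PIXELS, patch_microns=PATCH_MICRONS):
    """Physical size of one pixel implied by the patch dimensions."""
    return patch_microns / patch_pixels


def mean_patches_per_slide(n_patches=N_PATCHES, n_slides=N_SLIDES):
    """Average number of patches retained per slide after pre-processing."""
    return n_patches / n_slides


def patch_origins(width, height, patch=PATCH_PIXELS):
    """Top-left corners of patches that fit entirely inside an image.

    Assumption (not from the source): non-overlapping patches on a
    regular grid, with partial patches at the right and bottom dropped.
    """
    return [
        (x, y)
        for y in range(0, height - patch + 1, patch)
        for x in range(0, width - patch + 1, patch)
    ]


if __name__ == "__main__":
    print(f"pixel size: {microns_per_pixel():.4f} um/pixel")
    print(f"mean patches per slide: {mean_patches_per_slide():.1f}")
    # Hypothetical 2048 x 1536 pixel region, used only to show the tiling.
    print(f"patches in a 2048 x 1536 region: {len(patch_origins(2048, 1536))}")
```

Under these figures one pixel covers just under half a micrometre of tissue, and a slide contributes roughly 500 patches on average. The count for an individual slide will vary with how much tissue it contains.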

Authors

  • Bálint Ármin Pataki
    Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary. patbaa@caesar.elte.hu.
  • Alex Olar
    Department of Physics of Complex Systems, ELTE Eötvös Loránd University, Budapest, Hungary.
  • Dezső Ribli
    Department of Physics of Complex Systems, Eötvös Loránd University, Budapest, Hungary. dkrib@caesar.elte.hu.
  • Adrián Pesti
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • Endre Kontsek
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • Benedek Gyöngyösi
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • Ágnes Bilecz
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • Tekla Kovács
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • Kristóf Attila Kovács
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • Zsófia Kramer
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • András Kiss
    2nd Department of Pathology, Semmelweis University, Budapest, Hungary.
  • Miklós Szócska
    Health Services Management Training Centre, Semmelweis University, Budapest, Hungary.
  • Péter Pollner
    MTA-ELTE Statistical and Biological Physics Research Group, Hungarian Academy of Sciences, Budapest, Hungary.
  • István Csabai
    Department of Physics of Complex Systems, Eötvös Loránd University, Budapest, Hungary.