NLSTseg: A Pixel-level Lung Cancer Dataset Based on NLST LDCT Images.

Journal: Scientific Data
Published Date:

Abstract

Low-dose computed tomography (LDCT) is the most effective tool for early detection of lung cancer. With advances in artificial intelligence, various computer-aided diagnosis (CAD) systems are now used in clinical practice, and they help radiologists who must review a large volume of CT scans. However, developing these systems depends on precisely annotated datasets, which are currently limited. Although several lung imaging datasets exist, only a few publicly available datasets provide segmentation annotations on LDCT images. To address this problem, we developed a dataset based on NLST LDCT images with pixel-level annotations of lung lesions. The dataset includes LDCT scans from 605 patients and 715 annotated lesions, comprising 662 lung tumors and 53 lung nodules. Lesion volumes range from 0.03 cm³ to 372.21 cm³, with 500 lesions smaller than 5 cm³, mostly located in the right upper lung. A 2D U-Net model trained on the dataset achieved an IoU of 0.95 on the training dataset. This dataset enhances the diversity and usability of lung cancer annotation resources.
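
For readers unfamiliar with the two quantities reported above, the following is a minimal illustrative sketch, not the authors' code. It assumes binary masks stored as nested lists of 0/1 and a voxel spacing given in millimetres. The function names `iou` and `lesion_volume_cm3` are hypothetical, and returning 1.0 for two empty masks is a convention chosen here, not one stated in the abstract.

```python
def iou(pred, truth):
    """Intersection over union (IoU) of two same-shaped binary 2D masks.

    Masks are nested lists of 0/1 values (an assumption made for this sketch).
    """
    intersection = 0
    union = 0
    for pred_row, truth_row in zip(pred, truth):
        for p, t in zip(pred_row, truth_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Convention assumed here: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


def lesion_volume_cm3(n_voxels, spacing_mm):
    """Volume of a lesion from its voxel count and voxel spacing (dx, dy, dz) in mm."""
    dx, dy, dz = spacing_mm
    volume_mm3 = n_voxels * dx * dy * dz
    return volume_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


# Toy example: 2 pixels overlap, 4 pixels are covered by either mask.
pred = [[0, 1, 1],
        [0, 1, 0]]
truth = [[0, 1, 0],
         [0, 1, 1]]
print(iou(pred, truth))                              # 0.5
print(lesion_volume_cm3(1000, (0.5, 0.5, 2.0)))      # 0.5
```

An IoU of 0.95 therefore means that the predicted and annotated lesion regions share 95% of their combined area. Because it was measured on the training dataset, it describes how well the model fits the annotations rather than how it would perform on unseen scans.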

Authors

  • Kun-Hui Chen
    Department of Orthopedic Surgery, Taichung Veterans General Hospital, Taichung, Taiwan.
  • Yi-Hui Lin
    School of Pharmacy, Kaohsiung Medical University, 100 Shihchuan 1st Rd., Kaohsiung, 80708, Taiwan.
  • Shawn Wu
    Department of Diagnostic Imaging, SY Research Institute, Dallas, USA.
  • Nai-Wen Shih
    Department of Radiation Oncology, Pingtung Veterans General Hospital, Pingtung City, Taiwan.
  • Hsing-Chen Meng
    Graduate Degree Program of AI, National Yang Ming Chiao Tung University, Taichung, Taiwan.
  • Yen-Yu Lin
    Department of Neurosurgery, Neurological Institute, Taipei Veterans General Hospital, Taipei, Taiwan.
  • Chun-Rong Huang
  • Jing-Wen Huang
    Department of Radiation Oncology, Taichung Veterans General Hospital, Taichung 407, Taiwan.