NLSTseg: A Pixel-level Lung Cancer Dataset Based on NLST LDCT Images.
Journal:
Scientific Data
Published Date:
Aug 23, 2025
Abstract
Low-dose computed tomography (LDCT) is the most effective tool for early detection of lung cancer. With advancements in artificial intelligence, various Computer-Aided Diagnosis (CAD) systems are now used in clinical practice, where they help radiologists who must read a large volume of CT scans. However, the development of these systems depends on precisely annotated datasets, which are currently limited. Although several lung imaging datasets exist, only a few publicly available datasets provide segmentation annotations on LDCT images. To address this problem, we developed a dataset based on NLST LDCT images with pixel-level annotations of lung lesions. The dataset includes LDCT scans from 605 patients and 715 annotated lesions, comprising 662 lung tumors and 53 lung nodules. Lesion volumes range from 0.03 cm³ to 372.21 cm³, with 500 lesions smaller than 5 cm³, and most lesions are located in the right upper lung. A 2D U-Net model trained on the dataset achieved an IoU of 0.95 on the training dataset. This dataset enhances the diversity and usability of lung cancer annotation resources.
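The two quantities reported above, intersection over union (IoU) and lesion volume, can both be computed directly from binary segmentation masks. The following is a minimal sketch, not the authors' evaluation code: the function names `iou` and `lesion_volume_cm3`, the use of NumPy, the convention that two empty masks score 1.0, and the assumption that voxel spacing is given in millimetres are all illustrative choices that the abstract does not specify.

```python
import numpy as np


def iou(pred, target) -> float:
    """Intersection over union of two binary masks with identical shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    if pred.shape != target.shape:
        raise ValueError("masks must have the same shape")
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: two empty masks agree perfectly.
        return 1.0
    intersection = np.logical_and(pred, target).sum()
    return float(intersection / union)


def lesion_volume_cm3(mask, spacing_mm) -> float:
    """Volume of a binary 3D mask in cubic centimetres.

    spacing_mm is the voxel size along each axis in millimetres
    (an assumption; a real pipeline would read it from the image header).
    """
    voxel_mm3 = float(np.prod(spacing_mm))
    n_voxels = int(np.asarray(mask, dtype=bool).sum())
    return n_voxels * voxel_mm3 / 1000.0  # 1 cm^3 = 1000 mm^3


if __name__ == "__main__":
    pred = np.zeros((4, 4), dtype=bool)
    target = np.zeros((4, 4), dtype=bool)
    pred[0:2, 0:2] = True    # 4 predicted pixels
    target[1:3, 0:2] = True  # 4 annotated pixels, 2 of them shared
    print(iou(pred, target))

    lesion = np.ones((10, 10, 10), dtype=bool)
    print(lesion_volume_cm3(lesion, (0.5, 0.5, 2.0)))
```

Note that the reported 0.95 is measured on the training data, so a figure computed this way on held-out scans would be the more informative test of generalization.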