COVID-19CT+: A public dataset of CT images for COVID-19 retrospective analysis.

Journal: Journal of X-ray science and technology

Abstract

Background and objective: COVID-19 is considered the biggest global health disaster of the 21st century and has had a huge impact on the world.

Methods: This paper presents a publicly available dataset of CT images of multiple types of pneumonia (COVID-19CT+). Specifically, the dataset contains 409,619 CT images of 1333 patients, with Subset-A containing 312 community-acquired pneumonia cases and Subset-B containing 1021 COVID-19 cases. To show that classification methods from different periods perform differently on COVID-19CT+ images, we selected 13 classical machine learning classifiers and 5 deep learning classifiers and tested them on the image classification task.

Results: Two sets of experiments are conducted using traditional machine learning and deep learning methods. The first set classifies COVID-19 in Subset-B versus COVID-19 white lung disease, and the second set classifies community-acquired pneumonia in Subset-A versus COVID-19 in Subset-B. Together they demonstrate that methods from different periods perform differently on COVID-19CT+. In the first set of experiments, the accuracy of traditional machine learning reaches a maximum of 97.3% and a minimum of only 62.6%, while the deep learning algorithms reach a maximum of 97.9% and a minimum of 85.7%. In the second set of experiments, traditional machine learning reaches a high of 94.6% accuracy and a low of 56.8%, while the deep learning algorithms reach a high of 91.9% and a low of 86.3%.

Conclusions: COVID-19CT+ covers a large number of CT images of patients with COVID-19 and community-acquired pneumonia and is one of the largest datasets available. We expect this dataset to attract more researchers to explore new automated diagnostic algorithms and so contribute to improving the accuracy and efficiency of COVID-19 diagnosis.
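
The figures quoted in the abstract can be cross-checked with a few lines of arithmetic. The Python sketch below is illustrative only: the variable names are assumptions and not part of the dataset or its tooling, and the mean images per patient is pooled over both subsets, since the abstract gives no per-subset image counts.

```python
# Figures quoted in the abstract; all names here are hypothetical.
TOTAL_IMAGES = 409_619
SUBSET_A_PATIENTS = 312    # community-acquired pneumonia cases
SUBSET_B_PATIENTS = 1021   # COVID-19 cases

# The two subsets should add up to the reported patient total (1333).
total_patients = SUBSET_A_PATIENTS + SUBSET_B_PATIENTS

# Pooled average over both subsets (per-subset image counts are not reported).
mean_images_per_patient = TOTAL_IMAGES / total_patients

# Share of patients who are COVID-19 cases, a rough class-imbalance indicator.
covid_share = SUBSET_B_PATIENTS / total_patients

# Reported accuracy ranges in percent: (minimum, maximum).
accuracy_range = {
    ("experiment 1", "traditional ML"): (62.6, 97.3),
    ("experiment 1", "deep learning"): (85.7, 97.9),
    ("experiment 2", "traditional ML"): (56.8, 94.6),
    ("experiment 2", "deep learning"): (86.3, 91.9),
}

# Gap between best and worst classifier in each group, in percentage points.
spread = {key: round(hi - lo, 1) for key, (lo, hi) in accuracy_range.items()}

if __name__ == "__main__":
    print(f"patients: {total_patients}")
    print(f"mean images per patient: {mean_images_per_patient:.1f}")
    print(f"COVID-19 share of patients: {covid_share:.1%}")
    for (experiment, family), gap in spread.items():
        print(f"{experiment}, {family}: spread {gap} points")
```

On the reported numbers, the gap between the best and worst classifier is much wider for traditional machine learning than for deep learning in both experiments, even though the best traditional classifier scores higher than the best deep learning model in the second experiment.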

Authors

  • Yihao Sun
    College of Computer Science, Sichuan University, Chengdu, China.
  • Tianming Du
  • Bin Wang
    State Key Laboratory of Soil Erosion and Dryland Farming on the Loess Plateau, Northwest A&F University, Yangling 712100, China; New South Wales Department of Primary Industries, Wagga Wagga Agricultural Institute, Wagga Wagga 2650, Australia. Electronic address: bin.a.wang@dpi.nsw.gov.au.
  • Md Mamunur Rahaman
    Microscopic Image and Medical Image Analysis Group, College of Medicine and Biological Information Engineering, Northeastern University, Shenyang, China.
  • Xinghao Wang
    Department of Radiology, Beijing Friendship Hospital, Capital Medical University, Beijing, People's Republic of China.
  • Xinyu Huang
    Institute of Medical Informatics, University of Lübeck, Ratzeburger Allee 160, Lübeck 23538, Germany. Electronic address: huang@imi.uni-luebeck.de.
  • Tao Jiang
    Department of Respiratory and Critical Care Medicine, Center for Respiratory Medicine, the Fourth Affiliated Hospital of School of Medicine, and International School of Medicine, International Institutes of Medicine, Zhejiang University, Yiwu, China.
  • Marcin Grzegorzek
    Institute for Vision and Graphics, University of Siegen, Hoerlindstr. 3, 57076 Siegen, Germany.
  • Hongzan Sun
    Shengjing Hospital, China Medical University, Shenyang, 110001, China.
  • Jian Xu
    Department of Cardiology, Lishui Central Hospital and the Fifth Affiliated Hospital of Wenzhou Medical University, Lishui, China.
  • Chen Li
    School of Computer Science and Technology, Xi'an Jiaotong University, Xi'an, Shaanxi, China.
