Deep learning for computer-aided abnormalities classification in digital mammogram: A data-centric perspective.

Journal: Current problems in diagnostic radiology
Published Date:

Abstract

Breast cancer is the most common type of cancer in women, and early abnormality detection using mammography can significantly improve breast cancer survival rates. Diverse datasets are required to improve the training and validation of deep learning (DL) systems for autonomous breast cancer diagnosis. However, only a small number of mammography datasets are publicly available. This constraint has created challenges when comparing different DL models using the same dataset. The primary contribution of this study is the comprehensive description of a selection of currently available public mammography datasets. The information available on publicly accessible datasets is summarized and their usability reviewed to enable more effective models to be developed for breast cancer detection and to improve understanding of existing models trained using these datasets. This study aims to bridge the existing knowledge gap by offering researchers and practitioners a valuable resource to develop and assess DL models in breast cancer diagnosis.

Authors

  • Vineela Nalla
    Department of Information Technology, Kennesaw State University, Kennesaw, Georgia, USA.
  • Seyedamin Pouriyeh
    Department of Information Technology, Kennesaw State University, Marietta, GA, USA. Electronic address: spouriye@kennesaw.edu.
  • Reza M Parizi
    Department of Software Engineering and Game Development, Kennesaw State University, Marietta, GA, USA. Electronic address: rparizi1@kennesaw.edu.
  • Hari Trivedi
    Department of Radiology, Emory University, Atlanta, GA, USA.
  • Quan Z Sheng
    Department of Computing, Macquarie University, Sydney, Australia. Electronic address: michael.sheng@mq.edu.au.
  • Inchan Hwang
    School of Data Science and Analytics, Kennesaw State University, Kennesaw, Georgia, USA.
  • Laleh Seyyed-Kalantari
    Computer Science, University of Toronto, Toronto, Ontario, Canada; Vector Institute, Toronto, Ontario, Canada. Corresponding author: laleh@cs.toronto.edu.
  • MinJae Woo
    Department of Public Health Sciences, Clemson University, 501 Edwards Hall, Clemson, SC, 29634, USA.