A fusocelular skin dataset with whole slide images for deep learning models.

Journal: Scientific Data
PMID:

Abstract

Cutaneous spindle cell (CSC) lesions encompass a spectrum from benign to malignant neoplasms, often posing significant diagnostic challenges. Computer-aided diagnosis systems offer a promising way to make pathologists' decisions faster and more objective. These systems usually require large-scale datasets with curated labels for effective training; however, manual annotation is time-consuming and expensive. To overcome this challenge, crowdsourcing has emerged as a popular and valuable strategy to scale up the labeling process by distributing the effort among multiple non-expert annotators. This work introduces AI4SkIN, the first public dataset of Whole Slide Images (WSIs) for CSC neoplasms, annotated using an innovative crowdsourcing protocol. The AI4SkIN dataset contains 641 Hematoxylin and Eosin stained WSIs with multiclass labels from both expert and trainee pathologists. The dataset enables improved CSC neoplasm diagnosis using advanced machine learning and crowdsourcing methods based on Gaussian Processes, and we show that models trained on non-expert labels perform comparably to those trained on expert labels. In conclusion, we illustrate that AI4SkIN provides a good resource for developing and validating methods for multiclass CSC neoplasm classification.

Authors

  • Rocío Del Amor
    Instituto Universitario de Investigación en Tecnología Centrada en el Ser Humano (HUMAN-Tech), Universitat Politècnica de València, 46022, Spain.
  • Miguel López-Pérez
    Department of Computer Science and Artificial Intelligence, University of Granada, 18071, Granada, Spain.
  • Pablo Meseguer
    Instituto de Investigación e Innovación en Bioingeniería, HUMAN-tech, Universitat Politècnica de València, València, Spain; Valencian Graduate School and Research Network of Artificial Intelligence, Valencia, Spain.
  • Sandra Morales
    Instituto de Investigación e Innovación en Bioingeniería, I3B, Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain.
  • Liria Terradez
    Anatomical Pathology Service, University Clinical Hospital of Valencia, Spain.
  • José Aneiros-Fernández
    Department of R&D, HT Médica, San Juan de Dios Hospital, Córdoba, Spain; Pathology Unit, Azienda Sanitaria Provinciale Catania, Gravina Hospital, Caltagirone, Italy.
  • Javier Mateos
    Department of Computer Science and Artificial Intelligence, University of Granada, Granada, Spain.
  • Rafael Molina
    Department of Computer Science and Artificial Intelligence, University of Granada, 18071, Granada, Spain.
  • Valery Naranjo
    Instituto de Investigación e Innovación en Bioingeniería, I3B, Universitat Politècnica de València, Camino de Vera s/n, 46022, Valencia, Spain.