A High Magnifications Histopathology Image Dataset for Oral Squamous Cell Carcinoma Diagnosis and Prognosis.

Journal: Scientific data
Published Date:

Abstract

Oral Squamous Cell Carcinoma (OSCC) is a prevalent and aggressive malignancy where deep learning-based computer-aided diagnosis and prognosis can enhance clinical assessments. However, existing publicly available OSCC datasets often suffer from limited patient cohorts and a restricted focus on either diagnostic or prognostic tasks, limiting the development of comprehensive and generalizable models. To bridge this gap, we introduce Multi-OSCC, a new histopathology image dataset comprising 1,325 OSCC patients, integrating both diagnostic and prognostic information to expand existing public resources. Each patient is represented by six high resolution histopathology images captured at ×200, ×400, and ×1000-two per magnification-covering both the core and edge tumor regions. The Multi-OSCC dataset is richly annotated for six critical clinical tasks: recurrence prediction (REC), lymph node metastasis (LNM), tumor differentiation (TD), tumor invasion (TI), cancer embolus (CE), and perineural invasion (PI). We systematically evaluate the impact of different visual encoders, multi-image fusion techniques, stain normalization, and multi-task learning frameworks to benchmark this dataset. To accelerate future research, we publicly release the Multi-OSCC dataset at: https://github.com/guanjinquan/OSCC-PathologyImageDataset.

Authors

Keywords

No keywords available for this article.