Machine Learning analysis of high-grade serous ovarian cancer proteomic dataset reveals novel candidate biomarkers.

Journal: Scientific reports

PMID: 35197484

Abstract

Ovarian cancer is one of the most common gynecological malignancies, ranking third after cervical and uterine cancer. High-grade serous ovarian cancer (HGSOC) is one of the most aggressive subtypes, and the late onset of its symptoms leads in most cases to an unfavourable prognosis. Current predictive algorithms used to estimate the risk of ovarian cancer fail to provide sufficient sensitivity and specificity to be used widely in clinical practice. Adding further biomarkers or parameters such as age or menopausal status to overcome these issues has yielded only weak improvements. It is necessary to identify novel molecular signatures and to develop new predictive algorithms that can support the diagnosis of HGSOC and, at the same time, deepen the understanding of this elusive disease, with the final goal of improving patient survival. Here, we apply a machine learning-based pipeline to an open-source HGSOC proteomic dataset to develop a decision support system (DSS) that displayed high discriminating ability on a dataset of HGSOC biopsies. The proposed DSS consists of a double-step feature selection and a decision tree, and its output is a combination of three highly discriminating proteins (TOP1, PDIA4, and OGN) that could be of interest for further clinical and experimental validation. Furthermore, we used the ranked list of proteins generated during the feature selection steps to perform a pathway analysis that provides a snapshot of the main deregulated pathways of HGSOC. The datasets used for this study are available in the Clinical Proteomic Tumor Analysis Consortium (CPTAC) data portal (https://cptac-data-portal.georgetown.edu/).
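
The abstract names the shape of the pipeline (two feature-selection steps followed by a decision tree) but not the specific methods. The sketch below illustrates that general shape only and is not the authors' implementation. It assumes, purely for illustration, a Welch t-statistic filter as the first step and a single-split Gini ranking as the second. The data are synthetic, and the feature indices 3, 11 and 20 are hypothetical stand-ins with no relation to TOP1, PDIA4 or OGN.

```python
import random
import statistics
from collections import Counter

def make_synthetic(n_samples=120, n_features=30, informative=(3, 11, 20), seed=0):
    """Synthetic stand-in for a protein-abundance matrix (NOT CPTAC data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_samples):
        label = i % 2  # alternate classes so both are balanced
        row = [rng.gauss(0.0, 1.0) for _ in range(n_features)]
        for j in informative:
            row[j] += 4.0 * label  # shift informative features in class 1
        X.append(row)
        y.append(label)
    return X, y

def column(X, j):
    return [row[j] for row in X]

def welch_score(values, labels):
    """Absolute Welch t-statistic between the two classes for one feature."""
    a = [v for v, l in zip(values, labels) if l == 0]
    b = [v for v, l in zip(values, labels) if l == 1]
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return abs(statistics.mean(a) - statistics.mean(b)) / se

def gini(labels):
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(X, y, features):
    """Return (weighted_gini, feature, threshold) of the purest binary split, or None."""
    best = None
    for j in features:
        vals = sorted(set(column(X, j)))
        for lo, hi in zip(vals, vals[1:]):
            t = (lo + hi) / 2.0
            left = [l for row, l in zip(X, y) if row[j] <= t]
            right = [l for row, l in zip(X, y) if row[j] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t)
    return best

def select_features(X, y, keep_first=10, keep_final=3):
    """Two-step selection (assumed methods, see lead-in)."""
    # Step 1 (filter): keep the features with the largest Welch t-statistic.
    ranked = sorted(range(len(X[0])),
                    key=lambda j: welch_score(column(X, j), y), reverse=True)
    shortlist = ranked[:keep_first]
    # Step 2 (ranking): order the shortlist by the purity of a single split.
    shortlist.sort(key=lambda j: best_split(X, y, [j])[0])
    return sorted(shortlist[:keep_final])

def fit_tree(X, y, features, depth=2):
    """Small Gini-based decision tree; a leaf is a class label, a node is a tuple."""
    majority = Counter(y).most_common(1)[0][0]
    split = best_split(X, y, features) if depth > 0 and len(set(y)) > 1 else None
    if split is None:
        return majority
    _, j, t = split
    left = [(r, l) for r, l in zip(X, y) if r[j] <= t]
    right = [(r, l) for r, l in zip(X, y) if r[j] > t]
    return (j, t,
            fit_tree([r for r, _ in left], [l for _, l in left], features, depth - 1),
            fit_tree([r for r, _ in right], [l for _, l in right], features, depth - 1))

def predict(tree, row):
    while isinstance(tree, tuple):
        j, t, left, right = tree
        tree = left if row[j] <= t else right
    return tree

X, y = make_synthetic()
train_X, train_y = X[:80], y[:80]   # simple hold-out split
test_X, test_y = X[80:], y[80:]
selected = select_features(train_X, train_y)
tree = fit_tree(train_X, train_y, selected, depth=2)
accuracy = sum(predict(tree, r) == l for r, l in zip(test_X, test_y)) / len(test_y)
print("selected feature indices:", selected)
print("hold-out accuracy:", accuracy)
```

Selecting features on the training portion only, as done here, keeps the hold-out samples out of the selection step. A real analysis would also need cross-validation and an independent cohort before any selected protein is treated as a candidate biomarker.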

Authors

Federica Farinella

Division of Clinical Pathology, Laboratori Vita s.r.l., Via Sabaudia 19, 04100, Latina, Italy.
Mario Merone

Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo 21, 00128, Rome, Italy. m.merone@unicampus.it.
Luca Bacco

Unit of Computer Systems and Bioinformatics, Department of Engineering, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo 21, 00128, Rome, Italy.
Adriano Capirchio

Computational and Translational Neuroscience Laboratory, Institute of Cognitive Sciences and Technologies, National Research Council (CTN-ISTC-CNR), Via San Martino della Battaglia 44, 00185, Rome, Italy.
Massimo Ciccozzi

Unit of Medical Statistic and Epidemiology, Università Campus Bio-Medico di Roma, Via Alvaro del Portillo, 21, 00128, Rome, Italy.
Daniele Caligiore

Laboratory of Computational Embodied Neuroscience, Institute of Cognitive Sciences and Technologies, National Research Council of Italy, Rome, Italy. daniele.caligiore@istc.cnr.it.

Keywords

Biomarkers, Tumor; Correlation of Data; Cystadenocarcinoma, Serous; Databases, Factual; Decision Trees; Female; Humans; Machine Learning; Ovarian Neoplasms; Phenotype; Prognosis; Proteomics

