Machine learning and bioinformatics models to identify gene expression patterns of ovarian cancer associated with disease progression and mortality.

Journal: Journal of biomedical informatics
Published Date:

Abstract

Ovarian cancer (OC) is a common cause of cancer death among women worldwide, so there is a pressing need to identify factors influencing OC mortality. Much OC patient clinical data is publicly accessible via the Broad Institute Cancer Genome Atlas (TCGA) datasets, which include patient age, cancer site, stage, subtype and survival, as well as OC gene transcription profiles. These data allow studies that correlate OC patient survival (and other clinical variables) with gene expression to identify new OC biomarkers that predict patient mortality. We integrated clinical and tissue transcriptome data from patients available from the TCGA portal. We determined OC mRNA expression levels (compared to normal ovarian tissue) of 41 genes already implicated in OC progression, and assessed how their OC tissue expression levels predict patient survival. We employed Cox Proportional Hazard regression models to analyse clinical factors and transcriptomic information and to determine the relative effect on survival associated with each factor. Multivariate analysis of combined data (clinical and gene mRNA expression) found that age and ovary tumour site significantly correlated with patient survival. The univariate analysis also confirmed significant differences in patient survival time when altered transcription levels of TLR4, BSCL2, CDH1, ERBB2, and SCGB2A1 were evident, while multivariate analysis that considered the 41 genes simultaneously revealed a significant relationship of survival with the TLR4, BSCL2, CDH1, ERBB2 and PTPRE genes. However, analyses that considered all 41 genes together with clinical variables identified the genes TLR4, BSCL2, CDH1, ERBB2, BRCA2 and SCGB2A1 as independently related to survival in OC. These studies indicate that the latter genes influence OC patient survival, i.e., expression levels of these genes provide mechanistic and predictive information in addition to that of the clinical traits. Our study provides strong evidence that these genes are important prognostic indicators of patient survival that give clues to the biological processes that underlie OC progression and mortality.
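
To illustrate the kind of univariate Cox proportional hazards analysis the abstract describes, the following is a minimal sketch in pure Python. It is not the authors' code, and the software they used is not stated here. The function name cox_univariate, the variable names, and the eight-patient dataset are hypothetical and invented for illustration only (they are not TCGA values). The sketch fits a single binary covariate (altered versus unaltered expression of one gene) by Newton-Raphson on the partial likelihood and assumes there are no tied event times.

```python
import math

def cox_univariate(times, events, x, max_iter=50, tol=1e-9):
    """Fit h(t | x) = h0(t) * exp(beta * x) for one covariate by
    Newton-Raphson on the Cox partial likelihood (no tied event times)."""
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        grad, info = 0.0, 0.0
        for i, (t_i, d_i) in enumerate(zip(times, events)):
            if not d_i:
                continue  # censored patients contribute only through risk sets
            # Risk set: everyone still under observation at time t_i
            risk = [j for j, t_j in enumerate(times) if t_j >= t_i]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            mean = s1 / s0
            grad += x[i] - mean          # score contribution
            info += s2 / s0 - mean ** 2  # observed information contribution
        step = grad / info
        beta += step
        if abs(step) < tol:
            break
    # Information from the last iteration approximates the value at the optimum
    se = 1.0 / math.sqrt(info)
    return beta, se

# Hypothetical toy data: survival in months, event flag (1 = death,
# 0 = censored), and an indicator for altered expression of one gene.
times = [5, 8, 12, 14, 20, 25, 30, 36]
events = [1, 1, 1, 0, 1, 1, 0, 1]
altered = [1, 1, 0, 1, 1, 0, 0, 0]

beta, se = cox_univariate(times, events, altered)
hazard_ratio = math.exp(beta)
print(f"beta={beta:.3f}  SE={se:.3f}  hazard ratio={hazard_ratio:.2f}")
```

A hazard ratio above 1 in this toy setting means the altered-expression group dies at a higher rate. The multivariate analyses in the abstract extend the same idea to a vector of coefficients covering several genes and clinical variables at once, which in practice would normally be done with an established survival analysis package rather than hand-written code.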

Authors

  • Md Ali Hossain
    Dept of CSE, Manarat International University, Dhaka 1212, Bangladesh; Dept of CSE, Jahangirnagar University, Savar, Dhaka, Bangladesh.
  • Sheikh Muhammad Saiful Islam
    Dept. of Pharmacy, Manarat International University, Dhaka 1212, Bangladesh.
  • Julian M W Quinn
    Bone Biology Divisions, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia.
  • Fazlul Huq
    The University of Sydney, School of Medical Sciences, Faculty of Medicine & Health, NSW 2006, Australia.
  • Mohammad Ali Moni
    Bone Biology Divisions, Garvan Institute of Medical Research, Sydney, NSW 2010, Australia; The University of Sydney, School of Medical Sciences, Faculty of Medicine & Health, NSW 2006, Australia. Electronic address: mohammad.moni@sydney.edu.au.