Assessing concordance between RNA-Seq and NanoString technologies in Ebola-infected nonhuman primates using machine learning.

Journal: BMC genomics
PMID:

Abstract

This study evaluates the concordance between RNA sequencing (RNA-Seq) and NanoString technologies for gene expression analysis in nonhuman primates (NHPs) infected with Ebola virus (EBOV). A detailed comparison of the two platforms revealed a strong correlation, with Spearman coefficients for 56 of 62 samples ranging from 0.78 to 0.88. The mean and median coefficients were 0.83 and 0.85, respectively. Bland-Altman analysis confirmed high consistency across most measurements, with values falling within the 95% limits of agreement. A machine learning approach based on the Supervised Magnitude-Altitude Scoring (SMAS) method, trained on NanoString data, identified OAS1 as a key gene signature for distinguishing RT-qPCR-positive from RT-qPCR-negative samples. When used as the sole predictor in a logistic regression model, OAS1 retained its predictive power on RNA-Seq data from the same cohort of EBOV-infected NHPs, achieving 100% accuracy in distinguishing infected from non-infected samples. OAS1 was also tested on a completely independent held-out test set of human monocyte-derived dendritic cells (DCs) that were isolated and infected with different EBOV strains: wild-type (wt), VP35m, VP24m, and the double mutant VP35m & VP24m. It again achieved 100% accuracy in differentiating EBOV-infected from mock-infected samples, confirming its effectiveness as a predictive marker across diverse experimental setups and virus strains. Further differential expression analysis across both platforms identified 12 common genes (ISG15, OAS1, IFI44, IFI27, IFIT2, IFIT3, IFI44L, MX1, MX2, OAS2, RSAD2, and OASL) that showed the highest levels of statistical significance and biological relevance. Gene Ontology (GO) analysis confirmed the involvement of these genes in key immune and viral infection pathways, highlighting their importance in EBOV infection.
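The single-gene classification step described above can be illustrated with a minimal sketch. This is not the authors' code and does not implement SMAS; it only shows a plain one-predictor logistic regression, fitted by gradient descent on one set of values and evaluated on another. The expression values, labels, and all function names (`fit_logistic_1d`, `predict`) are hypothetical and chosen purely for illustration.

```python
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def fit_logistic_1d(x, y, lr=0.1, epochs=5000):
    """Fit p(y=1 | x) = sigmoid(w * z + b) by batch gradient descent,
    where z is x standardized with the training mean and SD."""
    mean = sum(x) / len(x)
    sd = math.sqrt(sum((v - mean) ** 2 for v in x) / len(x))
    w, b = 0.0, 0.0
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for xi, yi in zip(x, y):
            z = (xi - mean) / sd
            err = sigmoid(w * z + b) - yi  # gradient of the log-loss
            grad_w += err * z
            grad_b += err
        w -= lr * grad_w / len(x)
        b -= lr * grad_b / len(x)
    return w, b, mean, sd


def predict(model, x, threshold=0.5):
    w, b, mean, sd = model
    return [int(sigmoid(w * (v - mean) / sd + b) >= threshold) for v in x]


# Hypothetical log2 expression of one gene (standing in for OAS1).
# Training values play the role of one platform, test values the other.
train_x = [4.1, 4.5, 3.9, 4.8, 8.9, 9.4, 8.2, 9.9]
train_y = [0, 0, 0, 0, 1, 1, 1, 1]  # 1 = RT-qPCR positive (made-up labels)
test_x = [4.0, 4.6, 8.8, 9.3]
test_y = [0, 0, 1, 1]

model = fit_logistic_1d(train_x, train_y)
predicted = predict(model, test_x)
accuracy = sum(p == t for p, t in zip(predicted, test_y)) / len(test_y)
print("test accuracy:", accuracy)
```

The toy data are perfectly separable by construction, so the perfect score here says nothing about the study's results; in practice a library implementation with regularization would likely be preferred over hand-written gradient descent.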
RNA-Seq uniquely identified genes such as CASP5, USP18, and DDX60, which are important in immune regulation and antiviral defense and were not detected by NanoString, demonstrating the broader detection capabilities of RNA-Seq. Overall, this study indicates very strong agreement between the RNA-Seq and NanoString platforms in gene expression analysis, with RNA-Seq showing broader capabilities in identifying gene signatures.
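The two agreement measures named in the abstract, the Spearman coefficient and Bland-Altman limits of agreement, can be sketched for a single sample as follows. This is an illustrative sketch using only the Python standard library, not the authors' pipeline: the paired expression values are invented, the helper names are hypothetical, and the 95% limits are assumed to be the conventional mean difference ± 1.96 standard deviations.

```python
import math
import statistics


def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def pearson(x, y):
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    # Spearman's rho is the Pearson correlation of the ranks.
    return pearson(average_ranks(x), average_ranks(y))


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit, fraction of points inside)."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    lower, upper = bias - 1.96 * sd, bias + 1.96 * sd
    inside = sum(lower <= d <= upper for d in diffs) / len(diffs)
    return bias, lower, upper, inside


# Hypothetical log2 expression of eight genes in one sample.
rnaseq = [2.1, 5.3, 7.8, 3.3, 9.1, 6.0, 4.4, 8.2]
nanostring = [2.5, 5.0, 8.1, 3.9, 8.8, 6.4, 4.0, 7.9]

rho = spearman(rnaseq, nanostring)
bias, lower, upper, inside = bland_altman(rnaseq, nanostring)
print("Spearman rho:", round(rho, 3))
print("bias:", round(bias, 3), "limits:", (round(lower, 3), round(upper, 3)))
print("fraction within limits:", inside)
```

Repeating this per sample would yield one coefficient per sample, from which a range, mean, and median like those reported above could be summarized.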

Authors

  • Mostafa Rezapour
    Center for Biomedical Informatics, Wake Forest University School of Medicine, Winston-Salem, 27104, USA.
  • Aarthi Narayanan
    Department of Biology, George Mason University, Fairfax, VA 22030, United States.
  • Wyatt H Mowery
    Department of Electrical Engineering and Computer Science, Massachusetts Institute of Technology, Cambridge, MA, 02139, USA.
  • Metin Nafi Gurcan
    Center for Biomedical Informatics, Department of Internal Medicine, Wake Forest School of Medicine, Winston-Salem, NC, USA.