Machine learning approaches identify immunologic signatures of total and intact HIV DNA during long-term antiretroviral therapy.

Journal: eLife
PMID:

Abstract

Understanding the interplay between the HIV reservoir and the host immune system may yield insights into HIV persistence during antiretroviral therapy (ART) and inform strategies for a cure. Here, we applied machine learning (ML) approaches to cross-sectional high-parameter HIV reservoir and immunology data to characterize host-reservoir associations and generate new hypotheses about HIV reservoir biology. High-dimensional immunophenotyping, quantification of HIV-specific T cell responses, and measurement of genetically intact and total HIV proviral DNA frequencies were performed on peripheral blood samples from 115 people with HIV (PWH) on long-term ART. Analysis demonstrated that both intact and total proviral DNA frequencies were positively correlated with T cell activation and exhaustion. Years of ART and select bifunctional HIV-specific CD4 T cell responses were negatively correlated with the percentage of intact proviruses. A leave-one-covariate-out inference approach identified specific HIV reservoir and clinical-demographic parameters, such as age and biological sex, that were particularly important in predicting immunophenotypes. Overall, immune parameters were more strongly associated with total HIV proviral frequencies than intact proviral frequencies. Uniquely, however, expression of the IL-7 receptor alpha chain (CD127) on CD4 T cells was more strongly correlated with the intact reservoir. Unsupervised dimension reduction analysis identified two main clusters of PWH with distinct immune and reservoir characteristics. Using reservoir correlates identified in these initial analyses, decision tree methods were employed to visualize relationships among multiple immune and clinical-demographic parameters and the HIV reservoir. Finally, using random splits of our data as training-test sets, ML algorithms predicted with approximately 70% accuracy whether a given participant had qualitatively high or low levels of total or intact HIV DNA. The techniques described here may be useful for assessing global patterns within the increasingly high-dimensional data used in HIV reservoir and other studies of complex biology.
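
The final evaluation step described above (repeated random training-test splits used to predict a qualitatively high or low reservoir) can be illustrated with a minimal sketch. This is not the authors' pipeline: the data are synthetic, the single "marker" feature, the median cutoff for high versus low, the 25% test fraction, and the depth-1 decision tree (a stump) are all assumptions chosen only to keep the example self-contained in the Python standard library.

```python
import random
import statistics

def make_synthetic(n=115, seed=0):
    """Synthetic stand-in data (hypothetical): one immune marker loosely tied to reservoir size."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        marker = rng.gauss(0.0, 1.0)                    # hypothetical immune parameter
        reservoir = 0.8 * marker + rng.gauss(0.0, 1.0)  # hypothetical log reservoir frequency
        rows.append((marker, reservoir))
    return rows

def label_high_low(rows):
    """Dichotomize the reservoir value at the cohort median (1 = high, 0 = low); assumed cutoff."""
    cutoff = statistics.median(r for _, r in rows)
    return [(m, 1 if r > cutoff else 0) for m, r in rows]

def fit_stump(train):
    """Depth-1 decision tree: choose the marker threshold with the best training accuracy.
    Assumes a positive association (marker above threshold predicts 'high')."""
    best_thr, best_acc = None, -1.0
    for thr in sorted(m for m, _ in train):
        acc = sum((m > thr) == bool(y) for m, y in train) / len(train)
        if acc > best_acc:
            best_thr, best_acc = thr, acc
    return best_thr

def evaluate(data, n_splits=100, test_frac=0.25, seed=1):
    """Average held-out accuracy over repeated random training-test splits."""
    rng = random.Random(seed)
    n_test = int(len(data) * test_frac)
    accuracies = []
    for _ in range(n_splits):
        shuffled = data[:]
        rng.shuffle(shuffled)
        test, train = shuffled[:n_test], shuffled[n_test:]
        thr = fit_stump(train)  # the threshold is learned from training rows only
        accuracies.append(sum((m > thr) == bool(y) for m, y in test) / len(test))
    return statistics.mean(accuracies)

data = label_high_low(make_synthetic())
mean_accuracy = evaluate(data)
print(f"mean held-out accuracy: {mean_accuracy:.2f}")
```

Averaging over many random splits, rather than reporting a single split, reduces the dependence of the accuracy estimate on which participants happen to land in the test set, which matters for a cohort of this size.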

Authors

  • Lesia Semenova
    Microsoft Research, Duke University, Durham, United States.
  • Yingfan Wang
    Department of Computer Science, Duke University, Durham, United States.
  • Shane Falcinelli
    UNC HIV Cure Center, UNC Chapel Hill, Chapel Hill, United States.
  • Nancie Archin
    UNC HIV Cure Center, UNC Chapel Hill, Chapel Hill, United States.
  • Alicia D Cooper-Volkheimer
    Department of Medicine, Duke University, Durham, United States.
  • David M Margolis
    UNC HIV Cure Center, UNC Chapel Hill, Chapel Hill, United States.
  • Nilu Goonetilleke
    UNC HIV Cure Center, UNC Chapel Hill, Chapel Hill, United States.
  • David M Murdoch
    Department of Medicine, Duke University, Durham, United States.
  • Cynthia D Rudin
    Department of Computer Science, Duke University, Durham, United States.
  • Edward P Browne
    UNC HIV Cure Center, UNC Chapel Hill, Chapel Hill, United States.