Thrifty wide-context models of B cell receptor somatic hypermutation.

Journal: eLife
Published Date:

Abstract

Somatic hypermutation (SHM) is the diversity-generating process in antibody affinity maturation. Probabilistic models of SHM are needed for analyzing rare mutations, understanding the selective forces guiding affinity maturation, and understanding the underlying biochemical process. High-throughput data offer the potential to develop and fit models of SHM on relevant data sets. In this article, we model SHM using modern frameworks. We are motivated by recent work suggesting the importance of a wider context for SHM; however, assigning an independent rate to each k-mer leads to an exponential proliferation of parameters. Thus, using convolutions on 3-mer embeddings, we develop 'thrifty' models of SHM of various sizes; these can have fewer free parameters than a 5-mer model and yet have a significantly wider context. These offer a slight performance improvement over a 5-mer model, whereas other modern model elaborations worsen performance. We also find that a per-site effect is not necessary to explain SHM patterns given nucleotide context. Finally, the two current methods for fitting an SHM model (on out-of-frame sequence data and on synonymous mutations) produce significantly different results, and augmenting out-of-frame data with synonymous mutations does not aid out-of-sample performance.
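
The parameter-count argument can be illustrated with a short calculation. The sketch below is only a rough illustration, not the authors' implementation: the layer layout (a 3-mer embedding, one 1D convolution, and a linear output) and the sizes `embedding_dim`, `kernel_size`, and `num_filters` are assumptions chosen for the example, and the function names are hypothetical.

```python
def kmer_param_count(k: int) -> int:
    """Independent rates needed if every DNA k-mer gets its own parameter."""
    return 4 ** k


def conv_param_count(embedding_dim: int, kernel_size: int, num_filters: int) -> int:
    """Free parameters in an assumed embedding + convolution + linear model.

    The layout and sizes are hypothetical; they are not taken from the article.
    """
    embedding = (4 ** 3) * embedding_dim                     # one vector per 3-mer
    conv = num_filters * (embedding_dim * kernel_size + 1)   # weights plus bias
    output = num_filters + 1                                 # linear map to one rate per site
    return embedding + conv + output


def context_width(kernel_size: int) -> int:
    """Bases seen per site when the kernel slides over site-centered 3-mers."""
    return kernel_size + 2


if __name__ == "__main__":
    # A full k-mer table grows as 4**k.
    for k in (3, 5, 7, 11):
        print(f"{k}-mer table: {kmer_param_count(k)} parameters")

    # Assumed sizes for illustration only.
    n_params = conv_param_count(embedding_dim=6, kernel_size=9, num_filters=8)
    print(f"assumed convolutional model: {n_params} parameters, "
          f"context of {context_width(9)} bases")
```

Under these assumed sizes the convolutional model has fewer parameters than the 1,024 of a 5-mer table while seeing an 11-base context, for which a full k-mer table would need 4^11 (about 4.2 million) rates.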

Authors

  • Kevin Sung
    Department of Medicine, University of California San Diego, La Jolla, California, USA.
  • Mackenzie M Johnson
    Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, United States.
  • Will Dumm
    Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, United States.
  • Noah Simon
    Department of Biostatistics, University of Washington, Seattle, WA, 98195, USA.
  • Hugh Haddox
    Computational Biology Program, Fred Hutchinson Cancer Center, Seattle, United States.
  • Julia Fukuyama
    Department of Statistics, Indiana University, Bloomington, United States.
  • Frederick A Matsen
    Howard Hughes Medical Institute, Seattle, United States.