New deep learning-based methods for visualizing ecosystem properties using environmental DNA metabarcoding data.

Journal: Molecular Ecology Resources
Published Date:

Abstract

Environmental DNA (eDNA) metabarcoding provides an efficient approach for documenting biodiversity patterns in marine and terrestrial ecosystems. The complexity of these data prevents current methods from extracting and analyzing all the relevant ecological information they contain, and new methods may provide better dimensionality reduction and clustering. Here we present two new deep learning-based methods that combine different types of neural networks (NNs) to ordinate eDNA samples and visualize ecosystem properties in a two-dimensional space: the first is based on variational autoencoders and the second on deep metric learning. The strength of our new methods lies in the combination of two inputs: the number of sequences found for each molecular operational taxonomic unit (MOTU) detected and their corresponding nucleotide sequence. Using three different datasets, we show that our methods accurately represent several biodiversity indicators in a two-dimensional latent space: MOTU richness per sample, sequence α-diversity per sample, Jaccard's and sequence β-diversity between samples. We show that our nonlinear methods are better at extracting features from eDNA datasets while avoiding the major biases associated with eDNA. Our methods outperform traditional dimension reduction methods such as Principal Component Analysis, t-distributed Stochastic Neighbour Embedding, Nonmetric Multidimensional Scaling and Uniform Manifold Approximation and Projection for dimension reduction. Our results suggest that NNs provide a more efficient way of extracting structure from eDNA metabarcoding data, thereby improving their ecological interpretation and thus biodiversity monitoring.
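Two of the indicators named in the abstract, MOTU richness per sample and Jaccard's β-diversity between samples, can be illustrated with a minimal sketch. This is not the authors' code: the function names and the toy read-count table are hypothetical, and the sketch assumes a MOTU counts as detected in a sample whenever its read count is above zero. The sequence-based indicators and the neural network models themselves are not covered here.

```python
from itertools import combinations


def motu_richness(counts):
    """Return, for each sample (row), the number of MOTUs with at least one read."""
    return [sum(1 for reads in row if reads > 0) for row in counts]


def jaccard_dissimilarity(row_a, row_b):
    """Jaccard dissimilarity between two samples, computed on presence/absence."""
    present_a = {i for i, reads in enumerate(row_a) if reads > 0}
    present_b = {i for i, reads in enumerate(row_b) if reads > 0}
    union = present_a | present_b
    if not union:
        # Two empty samples: treated as identical here (an assumption of this sketch).
        return 0.0
    return 1.0 - len(present_a & present_b) / len(union)


# Hypothetical toy table: 3 samples (rows) x 4 MOTUs (columns); values are read counts.
toy_counts = [
    [10, 0, 3, 0],
    [0, 5, 2, 0],
    [7, 0, 1, 4],
]

richness = motu_richness(toy_counts)

# Pairwise beta-diversity, keyed by (sample index, sample index).
beta = {
    (i, j): jaccard_dissimilarity(toy_counts[i], toy_counts[j])
    for i, j in combinations(range(len(toy_counts)), 2)
}

print(richness)
print(beta)
```

An ordination that represents these indicators well would place samples with low pairwise dissimilarity close together in the two-dimensional space.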

Authors

  • Letizia Lamperti
    CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Montpellier, France.
  • Théophile Sanchez
    Laboratoire de Recherche en Informatique, CNRS UMR 8623, Université Paris-Saclay, Orsay, France.
  • Sara Si Moussi
    Laboratoire d'Ecologie Alpine, Univ. Grenoble Alpes, Univ. Savoie MontBlanc, CNRS, Grenoble, France.
  • David Mouillot
    MARBEC, Univ Montpellier, CNRS, Ifremer, IRD, Montpellier, France.
  • Camille Albouy
    Ecosystems and Landscape Evolution, Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland.
  • Benjamin Flück
    Ecosystems and Landscape Evolution, Department of Environmental Systems Science, ETH Zürich, Zürich, Switzerland.
  • Morgane Bruno
    CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Montpellier, France.
  • Alice Valentini
    SPYGEN, Le Bourget du Lac, France.
  • Loïc Pellissier
    Swiss Federal Institute for Forest, Snow, and Landscape Research WSL, Birmensdorf, Switzerland.
  • Stéphanie Manel
    CEFE, Univ Montpellier, CNRS, EPHE-PSL University, IRD, Montpellier, France.