Integration of diverse bioactivity data into the Chemical Checker compound universe.

Journal: Nature Protocols
Published Date:

Abstract

Chemical signatures encode the physicochemical and structural properties of small molecules into numerical descriptors, forming the basis for chemical comparisons and search algorithms. The increasing availability of bioactivity data has improved compound representations to include biological effects (for example, induced gene expression changes), although bioactivity descriptors are often limited to a few well-documented molecules. To address this issue, we implemented a collection of deep neural networks able to leverage the experimentally determined bioactivity data associated with small molecules and infer the missing bioactivity signatures for any compound of interest. However, unlike static chemical descriptors, these bioactivity signatures dynamically evolve with new data and processing strategies. Here we present a computational protocol to modify or generate novel bioactivity spaces and signatures, describing the main steps needed to leverage diverse bioactivity data with the current knowledge, as catalogued in the Chemical Checker (CC; https://chemicalchecker.org/), using the predefined data curation pipeline. We illustrate the functioning of the protocol through four specific examples, including the incorporation of new compounds into an existing bioactivity space, a change in the data preprocessing without altering the underlying experimental data and the creation of two novel bioactivity spaces from scratch, which are completed in under 9 h using graphics processing unit computing. Overall, this protocol offers a guideline for installing, testing and running the CC data integration approach on user-provided data, extending the annotation presented for a limited number of small molecules to a larger chemical landscape and generating novel bioactivity signatures.

Authors

  • Arnau Comajuncosa-Creus
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Martino Bertoni
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.
  • Miquel Duran-Frigola
    CeMM Research Center for Molecular Medicine of the Austrian Academy of Sciences, 1090 Vienna, Austria.
  • Adrià Fernández-Torras
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.
  • Oriol Guitart-Pla
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Nils Kurzawa
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Martina Locatelli
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Catalonia, Spain.
  • Yasmmin Martins
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Elena Pareja-Lorente
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Gema Rojas-Granado
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Nicolas Soler
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Eva Viesi
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
  • Patrick Aloy
    Institute for Research in Biomedicine (IRB Barcelona), The Barcelona Institute of Science and Technology, Barcelona, Spain.
