Cross modality learning of cell painting and transcriptomics data improves mechanism of action clustering and bioactivity modelling.

Journal: Scientific Reports
Published Date:

Abstract

In drug discovery, different data modalities (chemical structure, cell biology, quantum mechanics, etc.) are abundant, and their integration can help with understanding aspects of chemistry, biology, and their interactions. Within cell biology, cell painting (CP) and transcriptomics RNA-Seq (TX) screens are powerful tools in early drug discovery, as they provide complementary views of the biological effect of compounds on a population of cells post-treatment. While multimodal learning has been explored for chemical structure combined with cell painting and for different omics data, a multimodal model combining cell painting and bulk transcriptomics remains unexplored. In this work, we benchmark two representation learning methods: contrastive learning and a bimodal autoencoder. We use a cross-modality learning setting in which representation learning is performed with two modalities (CP and TX), but only CP is available for generating embeddings of new compounds and for downstream tasks. This reflects the fact that, for new compounds, we would have only CP data and not TX, owing to the high data generation cost of the RNA-Seq screen. We show that, in the absence of TX features for new compounds, learned embeddings such as those obtained from contrastive learning enhance the performance of CP features on tasks where TX features excel but CP features do not. Additionally, we observe that the learned representations improve cluster quality when clustering CP replicates and different mechanisms of action (MoA), and improve performance on several subsets of bioactivity tasks grouped by protein target family.
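The cross-modality setting described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the contrastive objective, so a symmetric InfoNCE-style loss is assumed here, and the linear encoders (`W_cp`, `W_tx`), feature dimensions, temperature, and random data are all hypothetical placeholders. The point is only that paired CP and TX profiles are used during training, while new compounds are embedded from CP alone.

```python
import numpy as np

def l2_normalize(x, eps=1e-12):
    """Scale each row to unit length so dot products are cosine similarities."""
    return x / (np.linalg.norm(x, axis=1, keepdims=True) + eps)

def log_softmax(logits, axis):
    """Numerically stable log-softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=axis, keepdims=True))

def symmetric_info_nce(z_cp, z_tx, temperature=0.1):
    """Assumed contrastive objective: row i of z_cp and row i of z_tx are the
    same compound (positive pair); all other rows in the batch are negatives."""
    z_cp, z_tx = l2_normalize(z_cp), l2_normalize(z_tx)
    logits = z_cp @ z_tx.T / temperature
    idx = np.arange(logits.shape[0])
    loss_cp_to_tx = -log_softmax(logits, axis=1)[idx, idx].mean()
    loss_tx_to_cp = -log_softmax(logits, axis=0)[idx, idx].mean()
    return 0.5 * (loss_cp_to_tx + loss_tx_to_cp)

# Synthetic stand-ins for paired CP and TX feature matrices (hypothetical sizes).
rng = np.random.default_rng(0)
n, d_cp, d_tx, d_emb = 8, 32, 64, 16
cp = rng.normal(size=(n, d_cp))
tx = rng.normal(size=(n, d_tx))

# Hypothetical linear encoders; a real model would train these to minimize the loss.
W_cp = rng.normal(size=(d_cp, d_emb))
W_tx = rng.normal(size=(d_tx, d_emb))
loss = symmetric_info_nce(cp @ W_cp, tx @ W_tx)

# Inference for new compounds: only CP features and the CP encoder are needed.
new_cp = rng.normal(size=(3, d_cp))
new_emb = l2_normalize(new_cp @ W_cp)
```

In this sketch the TX encoder is used only to compute the training loss. Downstream tasks such as MoA clustering or bioactivity modelling would consume `new_emb`, which is computed from CP features alone.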

Authors

  • Son V Ha
    Johnson & Johnson, Beerse, Belgium.
  • Steffen Jaensch
    Discovery Technology and Molecular Pharmacology, Janssen Research & Development, Pharmaceutical Companies of Johnson & Johnson, Beerse B-2340, Belgium.
  • Maciej M Kańduła
    Discovery Technology and Molecular Pharmacology, Janssen Research & Development, Pharmaceutical Companies of Johnson & Johnson, Beerse B-2340, Belgium.
  • Dorota Herman
    In-Silico Discovery, Janssen Research & Development, Pharmaceutical Companies of Johnson & Johnson, Beerse B-2340, Belgium.
  • Paul Czodrowski
    Technical University of Dortmund, Dortmund, Germany.
  • Hugo Ceulemans
    Janssen Pharmaceutica NV, Turnhoutseweg 30, 2340 Beerse, Belgium. Electronic address: hceulema@its.jnj.com.