canSAR 2024: an update to the public drug discovery knowledgebase.

Journal: Nucleic acids research
Published Date:

Abstract

canSAR (https://cansar.ai) continues to serve as the largest publicly available platform for cancer-focused drug discovery and translational research. It integrates multidisciplinary data from disparate and otherwise siloed public data sources as well as data curated uniquely for canSAR. In addition, canSAR deploys a suite of curation and standardization tools together with AI algorithms to generate new knowledge from these integrated data to inform hypothesis generation. Here we report the latest updates to canSAR. As well as increasing available data, we provide enhancements to our algorithms to improve the offering to the user. Notably, our enhancements include a revised ligandability classifier leveraging Positive Unlabeled Learning that finds twice as many ligandable opportunities across the pocketome, and our revised chemical standardization pipeline and hierarchy better enable the aggregation of structurally related molecular records.

Authors

  • Phillip W Gingrich
    Department of Genomic Medicine; Therapeutics Discovery Division; and The Institute for Data Science in Oncology; University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Rezvan Chitsazi
    Department of Genomic Medicine; Therapeutics Discovery Division; and The Institute for Data Science in Oncology; University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Ansuman Biswas
    Department of Genomic Medicine; Therapeutics Discovery Division; and The Institute for Data Science in Oncology; University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Chunjie Jiang
    College of Bioinformatics Science and Technology and Bio-Pharmaceutical Key Laboratory of Heilongjiang Province, Harbin Medical University, Nangang, Harbin, Heilongjiang, China.
  • Li Zhao
    International Initiative on Spatial Lifecourse Epidemiology (ISLE), the Netherlands; Department of Health Policy and Management, West China School of Public Health/West China Fourth Hospital, Sichuan University, Chengdu, Sichuan, 610041, China; Research Center for Healthy City Development, Sichuan University, Chengdu, Sichuan, 610041, China; Healthy Food Evaluation Research Center, Sichuan University, Chengdu, Sichuan, 610041, China.
  • Joseph E Tym
    The Institute of Cancer Research, London, United Kingdom.
  • Kevin M Brammer
    Enterprise Development and Integration, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Jun Li
    Department of Emergency, Zhuhai Integrated Traditional Chinese and Western Medicine Hospital, Zhuhai, 519020, Guangdong Province, China. quanshabai43@163.com.
  • Zhigang Shu
    Enterprise Development and Integration, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • David S Maxwell
    Department of Genomic Medicine; Therapeutics Discovery Division; and The Institute for Data Science in Oncology; University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Jeffrey A Tacy
    Enterprise Development and Integration, University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Ioan L Mica
    Department of Data Science, The Institute of Cancer Research, London SM2 5NG, UK.
  • Michael Darkoh
    Department of Genomic Medicine; Therapeutics Discovery Division; and The Institute for Data Science in Oncology; University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Patrizio Di Micco
    The Institute of Cancer Research, London, United Kingdom.
  • Kaitlyn P Russell
    Department of Genomic Medicine; Therapeutics Discovery Division; and The Institute for Data Science in Oncology; University of Texas MD Anderson Cancer Center, Houston, TX 77030, USA.
  • Paul Workman
    Cancer Research UK Cancer Therapeutics Unit, The Institute of Cancer Research, London SM2 5NG, UK.
  • Bissan Al-Lazikani
    The Institute of Cancer Research, London, United Kingdom.