Construction and application of SARS-CoV-2 protein ontology (CoVPO).

Journal: PLoS One
PMID:

Abstract

The emergence of SARS-CoV-2 and the resulting COVID-19 pandemic created an urgent need for in-depth molecular understanding, organization, and integration of data to expedite therapeutic and preventive strategies. A well-structured ontology of SARS-CoV-2 proteins is an essential step toward this goal. In response, this paper introduces CoVPO, a SARS-CoV-2 protein ontology that improves on existing ontologies in protein function annotation and viral-to-viral protein interactions, areas where those ontologies are limited in scope and do not cover all aspects of SARS-CoV-2 proteins. CoVPO extends classes from other relevant ontologies. Terms, annotations, and synonyms are added with proper definitions and clear origins, and an interaction map of viral-to-viral protein interactions is captured. We demonstrate CoVPO's application in an information retrieval system, where it expands user queries with related terms or concepts. This approach helps overcome issues such as term mismatch and improves the retrieval of relevant documents. Experiments demonstrate the feasibility of the domain ontology model, showing that it outperforms traditional keyword-based search and providing grounds for further research and discussion in the SARS-CoV-2 protein domain.
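
The query expansion idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the synonym map, the function names `expand_query` and `matches`, and the sample documents are all hypothetical, and a plain dictionary stands in for the ontology. The sketch only shows how adding synonyms to a query can recover a document that an exact keyword match would miss.

```python
# Hypothetical toy synonym map standing in for an ontology.
# These entries are illustrative and are NOT taken from CoVPO.
SYNONYMS = {
    "spike protein": ["S protein", "spike glycoprotein"],
    "nucleocapsid protein": ["N protein"],
}


def expand_query(query, synonyms):
    """Return the original query followed by synonyms of any label it mentions."""
    expanded = [query]
    lowered = query.lower()
    for label, alternatives in synonyms.items():
        if label in lowered:
            for alt in alternatives:
                if alt not in expanded:
                    expanded.append(alt)
    return expanded


def matches(document, terms):
    """True if the document contains any term (case-insensitive substring match)."""
    text = document.lower()
    return any(term.lower() in text for term in terms)


# Made-up document titles used only for this illustration.
docs = [
    "Structure of the S protein receptor-binding domain",
    "Envelope protein ion channel activity",
]

terms = expand_query("spike protein", SYNONYMS)
hits = [d for d in docs if matches(d, terms)]
```

Here the unexpanded query "spike protein" matches neither title, while the expanded term list retrieves the first one through the synonym "S protein". Substring matching is deliberately naive; a real system would tokenize text and rank results.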

Authors

  • Aaron Nkhata
    Institute of Big Data and Information Technology, Wenzhou University, Wenzhou, China.
  • Xiaona Shi
    Institute of Big Data and Information Technology, Wenzhou University, Wenzhou, China.
  • Yongjuan Zhang
    Department of Library, Information & Archives, Shanghai University, Shanghai, China.
  • Heng Chen
    Medical College, Guizhou University, Jiaxiu Road, Huaxi Zone, Guiyang 550025, P. R. China.