NLP-Assisted Pipeline for COVID-19 Core Outcome Set Identification Using ClinicalTrials.gov.

Journal: Studies in health technology and informatics

PMID: 35673091

Abstract

Core outcome sets (COS) are necessary to ensure the systematic collection, metadata analysis and sharing of information across studies. However, developing an area-specific COS for clinical research is costly and time-consuming. ClinicalTrials.gov, as a public repository, provides access to a vast collection of clinical trials and their characteristics, such as primary outcomes. With the growing number of COVID-19 clinical trials, identifying COSs from the outcomes of such trials is crucial. This paper introduces a semi-automatic pipeline that can efficiently identify, aggregate and rank the COS from the primary outcomes of COVID-19 clinical trials. Using natural language processing (NLP) techniques, our proposed pipeline successfully downloads and processes 5090 trials from all over the world and identifies COVID-19-specific outcomes that appeared in more than 1% of the trials. The top-ranked outcomes identified by the pipeline are mortality due to COVID-19, COVID-19 infection rate and COVID-19 symptoms.
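
The aggregate-and-rank step described in the abstract could be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names `normalize` and `rank_outcomes`, the input layout (a mapping from trial identifier to its primary outcome strings) and the simple string normalization are all assumptions standing in for whatever NLP processing the paper actually uses. Only the "more than 1% of the trials" threshold comes from the abstract.

```python
import re
from collections import Counter


def normalize(outcome: str) -> str:
    """Lowercase, drop punctuation, collapse whitespace.

    Hypothetical stand-in for real NLP normalization of outcome text.
    """
    text = re.sub(r"[^a-z0-9\s-]", " ", outcome.lower())
    return re.sub(r"\s+", " ", text).strip()


def rank_outcomes(trials: dict[str, list[str]],
                  min_share: float = 0.01) -> list[tuple[str, float]]:
    """Return (outcome, share of trials) pairs with share > min_share.

    `trials` maps a trial identifier to its primary outcome strings
    (an assumed input format). Each outcome counts once per trial.
    """
    total = len(trials)
    if total == 0:
        return []

    counts: Counter[str] = Counter()
    for outcomes in trials.values():
        # A set ensures a repeated outcome is counted once per trial.
        normalized = {normalize(o) for o in outcomes}
        normalized.discard("")  # ignore outcomes that normalize to nothing
        counts.update(normalized)

    ranked = [(o, n / total) for o, n in counts.items() if n / total > min_share]
    # Most frequent first; ties broken alphabetically for stable output.
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked
```

Called on a dictionary of trial outcomes, `rank_outcomes` returns the outcomes ordered by the fraction of trials reporting them, which mirrors the ranking idea in the abstract under the assumptions stated above.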

Authors

Fatemeh Shah-Mohammadi
Department of Biomedical Informatics, School of Medicine, University of Utah, USA.

Irena Parvanova
Center for Biomedical and Population Health Informatics, Icahn School of Medicine at Mount Sinai, New York, NY, USA.

Joseph Finkelstein
Department of Biomedical Informatics, School of Medicine, University of Utah, USA.

Keywords

Clinical Trials as Topic; COVID-19; Humans; Natural Language Processing; Outcome Assessment, Health Care

