Cancer Phenotype Development: A Literature Review.

Journal: Studies in health technology and informatics
Published Date:

Abstract

EHR-based computable phenotypes can be leveraged by healthcare organizations and researchers to improve cohort identification. The ability to identify patient cohorts by aspects of care and outcomes, based on clinical characteristics, diagnosed conditions, or risk factors, presents opportunities for researchers targeting specific populations for drug development and disease interventions. The objective of this review was to summarize the literature describing the development and use of phenotypes for identifying cohorts of cancer patients. A survey of the literature indexed in PubMed was performed to identify studies that used EHR-based phenotypes in cancer research. Specific search criteria were formulated using a phenotype identification guideline developed by the Phenotypes, Data Standards, and Data Quality Core of the NIH Health Care Systems Research Collaboratory. The final set of articles was examined further to identify 1) the cancer of interest and 2) the approaches used for phenotype development, validation, and implementation. The articles reviewed were specific to breast cancer, colorectal cancer, ovarian cancer, and lung cancer. The approaches taken for phenotype development and validation varied slightly among the relevant publications: four studies relied on chart review, three utilized machine learning techniques, one took an ontological approach, and one utilized natural language processing (NLP).

Authors

  • Pei Wang
    College of Engineering and Technology, Key Laboratory of Agricultural Equipment for Hilly and Mountain Areas, Southwest University, Chongqing, China.
  • Maryam Garza
    University of Arkansas for Medical Sciences, Little Rock, Arkansas.
  • Meredith Zozus
    University of Arkansas for Medical Sciences, Little Rock, Arkansas.