Applying density-based outlier identifications using multiple datasets for validation of stroke clinical outcomes.

Journal: International Journal of Medical Informatics
PMID:

Abstract

INTRODUCTION: Clinicians commonly use the modified Rankin Scale (mRS) and the Barthel Index (BI) to measure clinical outcome after stroke. Both are potential target variables for machine learning models that predict stroke outcome, so the quality of these measurements is crucial for training and validating such models. The objective of this study was to apply and evaluate density-based outlier detection methods for identifying potentially incorrect measurements in multiple large stroke datasets, and thereby to assess measurement quality.
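
The idea of density-based outlier detection on paired outcome scores can be illustrated with a minimal sketch. The abstract does not name the specific algorithm or parameters used, so the code below is an assumption: it applies a simple fixed-radius neighbor-count rule (the same notion of density behind the DBSCAN core-point test) to synthetic (mRS, BI) pairs. The function name flag_low_density, the values of eps and min_neighbors, and all data are hypothetical and chosen only for illustration.

```python
from math import hypot


def flag_low_density(points, eps, min_neighbors):
    """Return indices of points with fewer than `min_neighbors`
    other points within Euclidean distance `eps`."""
    flagged = []
    for i, (x_i, y_i) in enumerate(points):
        neighbors = sum(
            1
            for j, (x_j, y_j) in enumerate(points)
            if j != i and hypot(x_i - x_j, y_i - y_j) <= eps
        )
        if neighbors < min_neighbors:
            flagged.append(i)
    return flagged


# Synthetic (mRS, BI) pairs, invented for illustration only.
# mRS ranges 0-6 (higher is worse); BI ranges 0-100 (higher is better).
records = [
    (0, 100), (0, 95), (1, 95), (1, 90), (1, 100),  # good outcomes
    (4, 40), (4, 35), (4, 30),                      # moderate-severe disability
    (5, 10), (5, 5), (5, 0),                        # severe disability
    (5, 100),  # inconsistent: severe on mRS, fully independent on BI
]

# Rescale both scores to [0, 1] so neither dominates the distance.
scaled = [(mrs / 6, bi / 100) for mrs, bi in records]

# Assumed thresholds; a real analysis would need to tune these.
suspect = flag_low_density(scaled, eps=0.2, min_neighbors=2)
print([records[i] for i in suspect])
```

In this toy data only the pair (5, 100) sits in a sparse region of the mRS/BI plane, so it is the only record flagged for review. A flagged record is a candidate for checking, not a confirmed error.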

Authors

  • Ching-Heng Lin
    Institute of Biomedical Informatics, National Yang-Ming University, Taipei, Taiwan.
  • Kai-Cheng Hsu
    Bioinformatics Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States; Department of Neurology, National Taiwan University Hospital, Taipei, Taiwan.
  • Kory R Johnson
    Bioinformatics Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States.
  • Marie Luby
    Stroke Branch, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States.
  • Yang C Fann
    Bioinformatics Section, National Institute of Neurological Disorders and Stroke, National Institutes of Health, Bethesda, MD, United States. Electronic address: fann@ninds.nih.gov.