Detecting Uncoded Self-Harm in Veterans' Electronic Health Records Using Positive and Unlabeled Learning: Retrospective Cohort Study.
Journal:
Journal of Medical Internet Research
Published Date:
Jun 4, 2026
Abstract
BACKGROUND: Underdiagnosis and undercoding are common across mental health conditions, particularly suicide and self-harm. As a result, health care datasets lack the reliable negative examples needed for predictive modeling, phenotype prevalence estimation, and identification of individuals at elevated risk. We use positive and unlabeled (PU) learning to address this challenge.
OBJECTIVE: This study aims to identify US Veterans whose self-harm events were not explicitly captured through diagnostic codes in electronic health records (EHRs) and to estimate the underlying prevalence using a novel PU learning algorithm.
METHODS: We performed a retrospective cohort study using Veterans Health Administration EHRs (from October 1, 1999, to August 31, 2019), selecting a random 25% sample (1,329,120 of 5,316,480 Veterans; 1,193,563 males and 135,557 females) with at least 2 years of observation. The study cohort comprised 24,625 Veterans with coded self-harm and 1,304,495 without (uncoded), with mean ages of 38.39 (SD 12.17) and 48.76 (SD 15.04) years, respectively. We applied our PULSNAR (positive unlabeled learning selected not at random) algorithm to estimate the proportion of individuals with uncoded self-harm. Covariates included age, medical conditions, procedures, and clinical observations. Four experts (raters) independently reviewed charts of 97 uncoded Veterans, each selected from 1% intervals of calibrated PULSNAR probabilities from 0.01 to 0.97. Agreement was assessed among raters, PULSNAR classifications, and consensus review decisions. Post hoc calibration was used to refine prevalence estimates.
RESULTS: Of the 159,049 covariates in the dataset, PULSNAR's Extreme Gradient Boosting (XGBoost) model identified 1302 (0.82%) as informative for classification. Only 1.85% (24,625/1,329,120) of Veterans had diagnostic codes indicating self-harm events, while PULSNAR estimated an overall prevalence of 10.46% (139,026/1,329,120) by identifying an additional α=8.77% (114,404/1,304,495) of self-harm cases among the uncoded population. Of the 97 chart-reviewed patients, 39 had documented but uncoded self-harm. PULSNAR probabilities were post hoc calibrated such that their sum over the 97 cases equaled 39, which adjusted the combined coded and imputed prevalence downward from 10.46% to 7.91% (105,133/1,329,120). After this calibration was applied to shift the probabilities of all uncoded Veterans, with bootstrapping for confidence intervals, PULSNAR estimated that coded self-harm represents only 23.4% (95% CI 17.76% to 31.51%) of all documented (coded+notes) self-harm.
CONCLUSIONS: Under the "selected not at random" assumption, PULSNAR provides an innovative and scalable framework for estimating the clinically documented prevalence of mental health conditions and for identifying uncoded individuals with calibrated predictions, without requiring confirmed negative labels. This method offers an alternative to time-consuming chart reviews for detecting likely cases that structured coding fails to capture. By addressing diagnostic undercoding of mental health conditions in EHRs, this approach has the potential to improve estimates of mental health prevalence and to support screening, activation of automated clinical decision support, targeted intervention, better resource allocation, and research to improve outcomes in real-world settings.
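The abstract states that PULSNAR probabilities were post hoc calibrated so that their sum over the 97 chart-reviewed cases equaled 39, but it does not say how the shift was computed. The sketch below illustrates one plausible way to do this, assuming a single constant shift in logit space (an intercept-only recalibration) found by bisection. The function name `shift_to_target_sum`, the logit-shift approach, and the stand-in probability grid are all assumptions made for illustration, not the authors' implementation or data.

```python
import math


def logit(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def sigmoid(x):
    """Inverse of logit."""
    return 1.0 / (1.0 + math.exp(-x))


def shift_to_target_sum(probs, target_sum, tol=1e-9, max_iter=200):
    """Hypothetical helper: find one constant logit shift such that the
    shifted probabilities sum to target_sum. Assumes every probability
    lies strictly between 0 and 1."""
    if not 0 < target_sum < len(probs):
        raise ValueError("target_sum must lie between 0 and len(probs)")
    logits = [logit(p) for p in probs]
    lo, hi = -50.0, 50.0  # search bracket for the shift
    mid = 0.0
    for _ in range(max_iter):
        mid = (lo + hi) / 2.0
        total = sum(sigmoid(z + mid) for z in logits)
        if abs(total - target_sum) < tol:
            break
        # The sum increases monotonically with the shift, so bisect.
        if total < target_sum:
            lo = mid
        else:
            hi = mid
    return mid, [sigmoid(z + mid) for z in logits]


# Stand-in inputs (NOT study data): one probability per 1% interval
# from 0.01 to 0.97, mirroring how the 97 charts were sampled.
probs = [i / 100 for i in range(1, 98)]

# 39 of the 97 reviewed charts had documented but uncoded self-harm.
shift, calibrated = shift_to_target_sum(probs, target_sum=39)

print(f"logit shift: {shift:.4f}")
print(f"sum before: {sum(probs):.2f}, sum after: {sum(calibrated):.2f}")
```

Under this assumption, the same shift would then be applied to the probabilities of every uncoded Veteran, and their sum added to the coded count to give the adjusted prevalence. A constant logit shift preserves the ranking of individuals and keeps every probability between 0 and 1, which is why it is used here rather than a simple multiplicative rescaling.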