A computer model trained on Veterans Health Administration records has uncovered what clinicians suspected but couldn't prove: self-harm history in medical records is largely invisible to the systems designed to track it. Researchers at the University of New Mexico School of Medicine developed a machine learning method that found four times more documented self-harm than traditional diagnosis codes reveal—a discovery that could reshape how health systems plan mental health services and clinicians identify at-risk patients.

The study, published in the Journal of Medical Internet Research, analyzed electronic health records for more than 1.3 million patients served by the VHA. What the researchers found was stark: while diagnosis codes flagged a self-harm history in only 1.85% of veterans, expert review of clinical notes showed the actual prevalence was 7.9%. That gap, from about 24,000 to more than 100,000 patients at the stated cohort size, matters profoundly. Self-harm is one of the strongest predictors of future suicide risk and shapes how clinicians approach depression, PTSD, bipolar disorder, substance use, and traumatic brain injury.

The problem lies partly in how medical records are built. When a clinician documents a patient's condition in narrative notes, that information doesn't automatically translate into the structured diagnosis codes that researchers, administrators, and healthcare systems use to count and plan. "For research and planning, if we only count what is easy to see in diagnosis codes, we may substantially underestimate the need for mental health services," said Christophe Lambert, Ph.D., professor and interim chief of the Division of Translational Informatics at UNM and the study's corresponding author. Even problem lists—those summaries meant to flag critical health conditions for clinical teams—were unreliable: only 22.6% of veterans with a diagnosis code for self-harm also had it listed on their VHA problem list.

The researchers used a novel approach called PULSNAR—Positive Unlabeled Learning Selected Not At Random—designed to work with the messy reality of medical data. Traditional machine learning methods require clear "yes" and "no" examples, but medical records don't work that way. A missing diagnosis code doesn't mean a condition didn't exist; it just means it wasn't coded. PULSNAR learns from patients who do have a code, then estimates how many similar patients might be present among those without one, accounting for that fundamental uncertainty.
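To make the positive-unlabeled idea concrete, here is a minimal sketch in Python. It is not PULSNAR and not the study's code. It uses the simpler Elkan-Noto estimator, which assumes coded patients are a random sample of all true cases, the very assumption PULSNAR is designed to relax. The cohort is synthetic, and the 0-to-1 "note score", the 8% prevalence, the 25% coding rate, and all function names are hypothetical values chosen only for illustration.

```python
import random


def simulate_records(n=20000, prevalence=0.08, code_rate=0.25, seed=0):
    """Build a synthetic cohort of (note_score, has_code) pairs.

    All numbers here are illustrative assumptions, not study data.
    """
    rng = random.Random(seed)
    records, true_positives = [], 0
    for _ in range(n):
        has_history = rng.random() < prevalence
        true_positives += has_history
        # Hypothetical 0-1 score for how strongly the notes suggest self-harm.
        score = rng.betavariate(10, 2) if has_history else rng.betavariate(2, 10)
        # Only some true cases get a diagnosis code; no one else does.
        has_code = has_history and rng.random() < code_rate
        records.append((score, has_code))
    return records, true_positives


def estimate_positives(records, n_bins=10):
    """Elkan-Noto style estimate of how many patients are true positives."""
    totals = [0] * n_bins
    coded = [0] * n_bins
    for score, has_code in records:
        b = min(int(score * n_bins), n_bins - 1)
        totals[b] += 1
        coded[b] += has_code
    # g[b] approximates P(coded | score falls in bin b).
    g = [coded[b] / totals[b] if totals[b] else 0.0 for b in range(n_bins)]
    n_coded = sum(coded)
    # c_hat estimates P(coded | true positive) as the average of g over
    # coded patients. A careful version would use held-out data here.
    c_hat = sum(coded[b] * g[b] for b in range(n_bins)) / n_coded
    # Each coded patient stands in for roughly 1 / c_hat true positives.
    return n_coded / c_hat, c_hat


records, true_positives = simulate_records()
estimated, c_hat = estimate_positives(records)
n = len(records)
coded_prevalence = sum(has_code for _, has_code in records) / n

print(f"coded prevalence:     {coded_prevalence:.3f}")
print(f"estimated prevalence: {estimated / n:.3f}")
print(f"true prevalence:      {true_positives / n:.3f}")
```

In this toy setup the coded prevalence badly undercounts the true rate, while the estimate recovers most of the hidden cases because uncoded patients with high note scores look like the coded ones. When coding is not random among true cases, for example when severe cases are coded more often, this simple estimator becomes biased, which is the situation a selected-not-at-random method targets.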

The gap reveals what Lambert calls a "systems-level visibility problem." Patient records can be enormous—some in the study contained more than 500,000 lines of notes. No clinician can reasonably read all of that during a visit. The solution isn't more data, but smarter ways to surface what matters most. Lambert said the findings could help "health systems plan better, help researchers study care more accurately, and eventually help clinicians know when a patient may need a closer look."

The VHA already uses specialized tools to monitor suicide risk and doesn't rely only on diagnosis codes for that critical work. But this study identifies a gap in the routine systems researchers and administrators use to assess need and allocate resources. As mental health crises deepen across the veteran population, making hidden history visible could determine whether mental health services are planned to match actual need or fall short of it.