The Value of Unstructured Electronic Health Record Data in Geriatric Syndrome Case Identification

Hadi Kharrazi, Laura J. Anzaldi, Leilani Hernandez, Ashwini Davison, Cynthia M. Boyd, Bruce Leff, Joe Kimura, Jonathan P. Weiner

Research output: Contribution to journal › Article › peer-review

26 Scopus citations


Objectives: To examine the value of unstructured electronic health record (EHR) data (free-text notes) in identifying a set of geriatric syndromes. Design: Retrospective analysis of unstructured EHR notes using a natural language processing (NLP) algorithm. Setting: Large multispecialty group. Participants: Older adults (N=18,341; average age 75.9, 58.9% female). Measurements: We compared the number of geriatric syndrome cases identified using structured claims and structured and unstructured EHR data. We also calculated these rates using a population-level claims database as a reference and identified comparable epidemiological rates in peer-reviewed literature as a benchmark. Results: Using insurance claims data resulted in a geriatric syndrome prevalence ranging from 0.03% for lack of social support to 8.3% for walking difficulty. Using structured EHR data resulted in similar prevalence rates, ranging from 0.03% for malnutrition to 7.85% for walking difficulty. Incorporating unstructured EHR notes, enabled by applying the NLP algorithm, identified considerably higher rates of geriatric syndromes: absence of fecal control (2.1%, 2.3 times as much as structured claims and EHR data combined), decubitus ulcer (1.4%, 1.7 times as much), dementia (6.7%, 1.5 times as much), falls (23.6%, 3.2 times as much), malnutrition (2.5%, 18.0 times as much), lack of social support (29.8%, 455.9 times as much), urinary retention (4.2%, 3.9 times as much), vision impairment (6.2%, 7.4 times as much), weight loss (19.2%, 2.9 times as much), and walking difficulty (36.34%, 3.4 times as much). The geriatric syndrome rates extracted from structured data were substantially lower than published epidemiological rates, although adding the NLP results considerably closed this gap. Conclusion: Claims and structured EHR data give an incomplete picture of burden related to geriatric syndromes. Geriatric syndromes are likely to be missed if unstructured data are not analyzed. Pragmatic NLP algorithms can assist with identifying individuals at high risk of experiencing geriatric syndromes and improving coordination of care for older adults.
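
The Results give, for each syndrome, a prevalence that includes the NLP findings and a multiple relative to structured claims and EHR data combined. As a reader's back-of-envelope check (not part of the study), the Python sketch below divides one by the other to recover the implied structured-data prevalence and converts percentages to approximate case counts. It assumes every rate uses the full cohort of 18,341 as its denominator, which the abstract does not state explicitly. The function names are hypothetical, and the outputs are only as precise as the rounded figures in the abstract.

```python
# Figures copied from the abstract:
# syndrome -> (prevalence with NLP in %, multiple of structured claims + EHR data combined)
REPORTED = {
    "absence of fecal control": (2.1, 2.3),
    "decubitus ulcer": (1.4, 1.7),
    "dementia": (6.7, 1.5),
    "falls": (23.6, 3.2),
    "malnutrition": (2.5, 18.0),
    "lack of social support": (29.8, 455.9),
    "urinary retention": (4.2, 3.9),
    "vision impairment": (6.2, 7.4),
    "weight loss": (19.2, 2.9),
    "walking difficulty": (36.34, 3.4),
}

COHORT_SIZE = 18_341  # N from the abstract; assumed to be the denominator for every rate


def implied_structured_prevalence(nlp_pct: float, multiple: float) -> float:
    """Structured-only prevalence (%) implied by the reported rate and multiple."""
    return nlp_pct / multiple


def approx_cases(pct: float, n: int = COHORT_SIZE) -> int:
    """Approximate number of people corresponding to a prevalence in percent."""
    return round(pct / 100 * n)


if __name__ == "__main__":
    for syndrome, (nlp_pct, multiple) in REPORTED.items():
        structured_pct = implied_structured_prevalence(nlp_pct, multiple)
        print(
            f"{syndrome:26s} structured ~{structured_pct:6.3f}%  "
            f"with NLP {nlp_pct:6.2f}%  (~{approx_cases(nlp_pct):,} of {COHORT_SIZE:,})"
        )
```

Because the published percentages and multiples are rounded, the implied structured rates are approximations, especially for rare syndromes such as lack of social support.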

Original language: English (US)
Pages (from-to): 1499-1507
Number of pages: 9
Journal: Journal of the American Geriatrics Society
Issue number: 8
State: Published - Aug 1 2018


Keywords

  • case identification
  • electronic health records
  • geriatric syndromes
  • natural language processing and text-mining
  • unstructured free-text data

ASJC Scopus subject areas

  • Geriatrics and Gerontology

