TY - JOUR
T1 - The Value of Unstructured Electronic Health Record Data in Geriatric Syndrome Case Identification
AU - Kharrazi, Hadi
AU - Anzaldi, Laura J.
AU - Hernandez, Leilani
AU - Davison, Ashwini
AU - Boyd, Cynthia M.
AU - Leff, Bruce
AU - Kimura, Joe
AU - Weiner, Jonathan P.
N1 - Publisher Copyright: © 2018, Copyright the Authors. Journal compilation © 2018, The American Geriatrics Society
PY - 2018/8
Y1 - 2018/8
N2 - Objectives: To examine the value of unstructured electronic health record (EHR) data (free-text notes) in identifying a set of geriatric syndromes. Design: Retrospective analysis of unstructured EHR notes using a natural language processing (NLP) algorithm. Setting: Large multispecialty group. Participants: Older adults (N=18,341; average age 75.9, 58.9% female). Measurements: We compared the number of geriatric syndrome cases identified using structured claims and structured and unstructured EHR data. We also calculated these rates using a population-level claims database as a reference and identified comparable epidemiological rates in peer-reviewed literature as a benchmark. Results: Using insurance claims data resulted in a geriatric syndrome prevalence ranging from 0.03% for lack of social support to 8.3% for walking difficulty. Using structured EHR data resulted in similar prevalence rates, ranging from 0.03% for malnutrition to 7.85% for walking difficulty. Incorporating unstructured EHR notes, enabled by applying the NLP algorithm, identified considerably higher rates of geriatric syndromes: absence of fecal control (2.1%, 2.3 times as much as structured claims and EHR data combined), decubitus ulcer (1.4%, 1.7 times as much), dementia (6.7%, 1.5 times as much), falls (23.6%, 3.2 times as much), malnutrition (2.5%, 18.0 times as much), lack of social support (29.8%, 455.9 times as much), urinary retention (4.2%, 3.9 times as much), vision impairment (6.2%, 7.4 times as much), weight loss (19.2%, 2.9 times as much), and walking difficulty (36.34%, 3.4 times as much). The geriatric syndrome rates extracted from structured data were substantially lower than published epidemiological rates, although adding the NLP results considerably closed this gap. Conclusion: Claims and structured EHR data give an incomplete picture of burden related to geriatric syndromes. Geriatric syndromes are likely to be missed if unstructured data are not analyzed. Pragmatic NLP algorithms can assist with identifying individuals at high risk of experiencing geriatric syndromes and improving coordination of care for older adults.
KW - case identification
KW - electronic health records
KW - geriatric syndromes
KW - natural language processing and text-mining
KW - unstructured free-text data
UR - http://www.scopus.com/inward/record.url?scp=85053060692&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85053060692&partnerID=8YFLogxK
U2 - 10.1111/jgs.15411
DO - 10.1111/jgs.15411
M3 - Article
C2 - 29972595
AN - SCOPUS:85053060692
SN - 0002-8614
VL - 66
SP - 1499
EP - 1507
JO - Journal of the American Geriatrics Society
JF - Journal of the American Geriatrics Society
IS - 8
ER -