Exploring the use of machine learning for risk adjustment: A comparison of standard and penalized linear regression models in predicting health care costs in older adults

Research output: Contribution to journal (Article)

Abstract

Background: Payers and providers still primarily use ordinary least squares (OLS) to estimate expected economic and clinical outcomes for risk adjustment purposes. Penalized linear regression represents a practical and incremental step forward that provides transparency and interpretability within the familiar regression framework. This study conducted an in-depth comparison of prediction performance of standard and penalized linear regression in predicting future health care costs in older adults.

Methods and findings: This retrospective cohort study included 81,106 Medicare Advantage patients with 5 years of continuous medical and pharmacy insurance from 2009 to 2013. Total health care costs in 2013 were predicted with comorbidity indicators from 2009 to 2012. Using 2012 predictors only, OLS performed poorly (e.g., R² = 16.3%) compared to penalized linear regression models (R² ranging from 16.8 to 16.9%); using 2009–2012 predictors, the gap in prediction performance increased (R²: 15.0% versus 18.0–18.2%). OLS with a reduced set of predictors selected by lasso showed improved performance (R² = 16.6% with 2012 predictors, 17.4% with 2009–2012 predictors) relative to OLS without variable selection but still lagged behind the prediction performance of penalized regression. Lasso regression consistently generated prediction ratios closer to 1 across different levels of predicted risk compared to other models.

Conclusions: This study demonstrated the advantages of using transparent and easy-to-interpret penalized linear regression for predicting future health care costs in older adults relative to standard linear regression. Penalized regression showed better performance than OLS in predicting health care costs. Applying penalized regression to longitudinal data increased prediction accuracy. Lasso regression in particular showed superior prediction ratios across low and high levels of predicted risk. Health care insurers, providers and policy makers may benefit from adopting penalized regression such as lasso regression for cost prediction to improve risk adjustment and population health management and thus better address the underlying needs and risk of the populations they serve.
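
The comparison described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the data are synthetic (binary "comorbidity" indicators and skewed positive costs), not the study's Medicare Advantage data; the function names are hypothetical; the penalty value is arbitrary rather than tuned; and the prediction ratio is taken to mean the mean predicted cost divided by the mean observed cost within groups ordered by predicted risk, which the abstract does not spell out. The lasso is fitted by plain coordinate descent, and setting the penalty to zero reduces it to OLS.

```python
import random

def soft_threshold(z, t):
    """Soft-thresholding operator used in the lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0

def fit_lasso(X, y, lam, n_iter=100):
    """Coordinate descent for (1/2n)*||y - b0 - X b||^2 + lam*||b||_1.
    With lam = 0 this converges to the OLS fit (full-rank X assumed)."""
    n, p = len(X), len(X[0])
    b0, b = sum(y) / n, [0.0] * p
    resid = [yi - b0 for yi in y]          # resid = y - b0 - X b
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(c * c for c in col) / n
            if z == 0:
                continue                   # constant-zero column: nothing to fit
            # Covariance of feature j with the residual that excludes feature j.
            rho = sum(c * (r + c * b[j]) for c, r in zip(col, resid)) / n
            new_bj = soft_threshold(rho, lam) / z
            resid = [r - c * (new_bj - b[j]) for c, r in zip(col, resid)]
            b[j] = new_bj
        shift = sum(resid) / n             # unpenalized intercept update
        b0 += shift
        resid = [r - shift for r in resid]
    return b0, b

def predict(X, b0, b):
    return [b0 + sum(x * c for x, c in zip(row, b)) for row in X]

def r_squared(y, pred):
    mean_y = sum(y) / len(y)
    ss_res = sum((a - q) ** 2 for a, q in zip(y, pred))
    ss_tot = sum((a - mean_y) ** 2 for a in y)
    return 1 - ss_res / ss_tot

def prediction_ratios(y, pred, n_groups=4):
    """Mean predicted / mean observed cost within groups of predicted risk."""
    order = sorted(range(len(y)), key=lambda i: pred[i])
    size = len(order) // n_groups
    ratios = []
    for g in range(n_groups):
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        idx = order[g * size:end]
        ratios.append(sum(pred[i] for i in idx) / sum(y[i] for i in idx))
    return ratios

def make_data(n, p=8):
    """Synthetic data: only the first two indicators affect cost."""
    X = [[1.0 if random.random() < 0.3 else 0.0 for _ in range(p)]
         for _ in range(n)]
    y = [2000 + 3000 * r[0] + 1500 * r[1] + random.expovariate(1 / 2000)
         for r in X]
    return X, y

random.seed(0)
X_train, y_train = make_data(300)
X_test, y_test = make_data(100)
for name, lam in [("OLS", 0.0), ("lasso", 100.0)]:
    b0, b = fit_lasso(X_train, y_train, lam)
    pred = predict(X_test, b0, b)
    print(name, round(r_squared(y_test, pred), 3),
          [round(r, 2) for r in prediction_ratios(y_test, pred)])
```

On toy data like this the two fits may differ only slightly, and the output says nothing about the study's results. In practice the penalty would normally be chosen by cross-validation rather than fixed by hand.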

Original language: English (US)
Article number: e0213258
Journal: PLoS One
Volume: 14
Issue number: 3
DOI: 10.1371/journal.pone.0213258
State: Published - Mar 1 2019

Fingerprint

health care costs
Risk Adjustment
artificial intelligence
Health care
Linear regression
Health Care Costs
Learning systems
Linear Models
Least-Squares Analysis
least squares
prediction
Costs
Medicare Part C
Insurance Carriers
Health Policy
Insurance
Administrative Personnel
insurance
Health Personnel
Population

ASJC Scopus subject areas

  • Biochemistry, Genetics and Molecular Biology(all)
  • Agricultural and Biological Sciences(all)

Cite this

@article{49f745f8bb5849c2868116254a061063,
title = "Exploring the use of machine learning for risk adjustment: A comparison of standard and penalized linear regression models in predicting health care costs in older adults",
abstract = "Background Payers and providers still primarily use ordinary least squares (OLS) to estimate expected economic and clinical outcomes for risk adjustment purposes. Penalized linear regression represents a practical and incremental step forward that provides transparency and interpretability within the familiar regression framework. This study conducted an in-depth comparison of prediction performance of standard and penalized linear regression in predicting future health care costs in older adults. Methods and findings This retrospective cohort study included 81,106 Medicare Advantage patients with 5 years of continuous medical and pharmacy insurance from 2009 to 2013. Total health care costs in 2013 were predicted with comorbidity indicators from 2009 to 2012. Using 2012 predictors only, OLS performed poorly (e.g., $R^2$ = 16.3{\%}) compared to penalized linear regression models ($R^2$ ranging from 16.8 to 16.9{\%}); using 2009–2012 predictors, the gap in prediction performance increased ($R^2$: 15.0{\%} versus 18.0–18.2{\%}). OLS with a reduced set of predictors selected by lasso showed improved performance ($R^2$ = 16.6{\%} with 2012 predictors, 17.4{\%} with 2009–2012 predictors) relative to OLS without variable selection but still lagged behind the prediction performance of penalized regression. Lasso regression consistently generated prediction ratios closer to 1 across different levels of predicted risk compared to other models. Conclusions This study demonstrated the advantages of using transparent and easy-to-interpret penalized linear regression for predicting future health care costs in older adults relative to standard linear regression. Penalized regression showed better performance than OLS in predicting health care costs. Applying penalized regression to longitudinal data increased prediction accuracy. Lasso regression in particular showed superior prediction ratios across low and high levels of predicted risk. Health care insurers, providers and policy makers may benefit from adopting penalized regression such as lasso regression for cost prediction to improve risk adjustment and population health management and thus better address the underlying needs and risk of the populations they serve.",
author = "Hongjun Kan and Kharrazi, {Hadi H K} and Hsien-Yen Chang and David Bodycombe and Klaus Lemke and Jonathan Weiner",
year = "2019",
month = "3",
day = "1",
doi = "10.1371/journal.pone.0213258",
language = "English (US)",
volume = "14",
journal = "PLoS One",
issn = "1932-6203",
publisher = "Public Library of Science",
number = "3",

}

TY - JOUR

T1 - Exploring the use of machine learning for risk adjustment

T2 - A comparison of standard and penalized linear regression models in predicting health care costs in older adults

AU - Kan, Hongjun

AU - Kharrazi, Hadi H K

AU - Chang, Hsien-Yen

AU - Bodycombe, David

AU - Lemke, Klaus

AU - Weiner, Jonathan

PY - 2019/3/1

Y1 - 2019/3/1

AB - Background Payers and providers still primarily use ordinary least squares (OLS) to estimate expected economic and clinical outcomes for risk adjustment purposes. Penalized linear regression represents a practical and incremental step forward that provides transparency and interpretability within the familiar regression framework. This study conducted an in-depth comparison of prediction performance of standard and penalized linear regression in predicting future health care costs in older adults. Methods and findings This retrospective cohort study included 81,106 Medicare Advantage patients with 5 years of continuous medical and pharmacy insurance from 2009 to 2013. Total health care costs in 2013 were predicted with comorbidity indicators from 2009 to 2012. Using 2012 predictors only, OLS performed poorly (e.g., R² = 16.3%) compared to penalized linear regression models (R² ranging from 16.8 to 16.9%); using 2009–2012 predictors, the gap in prediction performance increased (R²: 15.0% versus 18.0–18.2%). OLS with a reduced set of predictors selected by lasso showed improved performance (R² = 16.6% with 2012 predictors, 17.4% with 2009–2012 predictors) relative to OLS without variable selection but still lagged behind the prediction performance of penalized regression. Lasso regression consistently generated prediction ratios closer to 1 across different levels of predicted risk compared to other models. Conclusions This study demonstrated the advantages of using transparent and easy-to-interpret penalized linear regression for predicting future health care costs in older adults relative to standard linear regression. Penalized regression showed better performance than OLS in predicting health care costs. Applying penalized regression to longitudinal data increased prediction accuracy. Lasso regression in particular showed superior prediction ratios across low and high levels of predicted risk. Health care insurers, providers and policy makers may benefit from adopting penalized regression such as lasso regression for cost prediction to improve risk adjustment and population health management and thus better address the underlying needs and risk of the populations they serve.

UR - http://www.scopus.com/inward/record.url?scp=85062613499&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85062613499&partnerID=8YFLogxK

U2 - 10.1371/journal.pone.0213258

DO - 10.1371/journal.pone.0213258

M3 - Article

C2 - 30840682

AN - SCOPUS:85062613499

VL - 14

JO - PLoS One

JF - PLoS One

SN - 1932-6203

IS - 3

M1 - e0213258

ER -