Objective: To predict the medical expenditures of individual diabetics and to assess the factors related to them. Design: Cross-sectional study. Setting and participants: Data were drawn from the Household Component of the US Medical Expenditure Panel Survey, 2000-2015. Main outcome measure: A random forest (RF) model was fitted with the randomForest package in R. Spearman correlation coefficients (rs), mean absolute error (MAE) and mean-related error (MRE) were computed to assess the predictive performance of all models. Results: Total medical expenditure increased from $105 billion in 2000 to $318 billion in 2015. The rs, MAE and MRE between the predicted and actual medical expenditures in the RF model were 0.644, $0.363 and 0.043%, respectively. The top predictor was treatment with insulin, followed by type of insurance, employment status, age and economic level. The latter four variables had no impact on the prediction of medical expenditure by insulin treatment. Further, in the sub-analyses by gender and age group, the evaluation indicators were almost identical to one another. The top five variables for total medical expenditure among males were the same as those among all diabetics. Expenses for doctor visits, hospital stays and drugs were also predicted well by the RF model. Treatment with insulin was the top factor for total medical expenditure among females and in the 18-, 25- and 65-age groups. Additionally, the RF model was slightly superior to the traditional regression model. Conclusions: The RF model can be used to predict the medical expenditure of diabetics and to assess its related factors.
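The three evaluation indicators named in the abstract can be illustrated with a minimal sketch. This is not the authors' code (the study used R), and it rests on assumptions: MRE is taken here to mean the average of |predicted - actual| / |actual|, which the abstract does not define, and the function names and toy values below are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """Rank values 1..n; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(actual, predicted):
    """Spearman rs: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(actual), average_ranks(predicted))


def mae(actual, predicted):
    """Mean absolute error."""
    return mean([abs(p - a) for a, p in zip(actual, predicted)])


def mre(actual, predicted):
    """Assumed definition: mean of absolute errors relative to actual values."""
    return mean([abs(p - a) / abs(a) for a, p in zip(actual, predicted)])


# Hypothetical toy values, not data from the study.
actual = [100.0, 200.0, 400.0, 800.0]
predicted = [110.0, 180.0, 500.0, 700.0]
print(spearman(actual, predicted), mae(actual, predicted), mre(actual, predicted))
```

Because rs depends only on rank order, it can be high even when MAE and MRE are large, which is one reason to report all three together.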
Keywords
- Medical expenditure
- Random forest
ASJC Scopus subject areas
- Health Policy
- Public Health, Environmental and Occupational Health