Smooth quantile ratio estimation with regression: Estimating medical expenditures for smoking-attributable diseases

Francesca Dominici, Scott Zeger

Research output: Contribution to journal › Article

Abstract

The methodological development of this paper is motivated by a common problem in econometrics where we are interested in estimating the difference in the average expenditures between two populations, say with and without a disease, as a function of the covariates. For example, let Y1 and Y2 be two nonnegative random variables denoting the health expenditures for cases and controls. Smooth Quantile Ratio Estimation (SQUARE) is a novel approach for estimating Δ = E[Y1] - E[Y2] by smoothing across percentiles the log-transformed ratio of the two quantile functions. Dominici et al. (2005) have shown that SQUARE defines a large class of estimators of Δ, is more efficient than common parametric and nonparametric estimators of Δ, and is consistent and asymptotically normal. However, in applications it is often desirable to estimate Δ(x) = E[Y1|x] - E[Y2|x], that is, the difference in means as a function of x. In this paper we extend SQUARE to a regression model and we introduce a two-part regression SQUARE for estimating Δ(x) as a function of x. We use the first part of the model to estimate the probability of incurring any costs and the second part of the model to estimate the mean difference in health expenditures, given that a nonzero cost is observed. In the second part of the model, we apply the basic definition of SQUARE for positive costs to compare expenditures for the cases and controls having 'similar' covariate profiles. We determine strata of cases and controls with 'similar' covariate profiles by the use of propensity score matching. We then apply two-part regression SQUARE to the 1987 National Medicare Expenditure Survey to estimate the difference Δ(x) between persons suffering from smoking-attributable diseases and persons without these diseases as a function of the propensity of getting the disease. Using a simulation study, we compare frequentist properties of two-part regression SQUARE with maximum likelihood estimators for the log-transformed expenditures.
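
The basic idea in the abstract, smoothing the log-transformed ratio of the two quantile functions across percentiles, can be illustrated with a minimal sketch. It is not the authors' implementation. It relies on the identity E[Y1] - E[Y2] = ∫ Q2(p) (exp(s(p)) - 1) dp, where s(p) = log(Q1(p)/Q2(p)), and makes several assumptions that are not taken from the paper: a least-squares straight line stands in for the smoother (the keywords suggest regression splines), the integral is approximated on a grid of 99 interior percentiles (so the extreme tails are truncated), and the function name `square_mean_difference` is hypothetical. Both samples must be strictly positive and larger than the grid.

```python
import math
import statistics


def square_mean_difference(y1, y2, n_grid=100):
    """Sketch of a SQUARE-style estimate of E[Y1] - E[Y2] for positive samples.

    Assumption: a straight line in p is used as the smoother; this is a
    stand-in chosen for brevity, not the smoother used in the paper.
    """
    # Empirical quantiles at p = 1/n_grid, ..., (n_grid - 1)/n_grid.
    q1 = statistics.quantiles(y1, n=n_grid)
    q2 = statistics.quantiles(y2, n=n_grid)
    p = [k / n_grid for k in range(1, n_grid)]

    # Log-transformed ratio of the two empirical quantile functions.
    log_ratio = [math.log(a / b) for a, b in zip(q1, q2)]

    # Smooth across percentiles with a least-squares line.
    p_bar = statistics.fmean(p)
    r_bar = statistics.fmean(log_ratio)
    sxy = sum((pi - p_bar) * (ri - r_bar) for pi, ri in zip(p, log_ratio))
    sxx = sum((pi - p_bar) ** 2 for pi in p)
    slope = sxy / sxx
    smooth = [r_bar + slope * (pi - p_bar) for pi in p]

    # Grid approximation of the integral of Q2(p) * (exp(s(p)) - 1) over p.
    return statistics.fmean(b * (math.exp(s) - 1.0) for b, s in zip(q2, smooth))
```

When the two samples differ only by a scale factor, the log ratio is constant in p, the smoother reproduces it exactly, and the estimate reduces to a multiple of the grid average of the control quantiles. The gain from smoothing shows up when the raw quantile ratio is noisy, as it typically is for skewed expenditure data.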

Original language: English (US)
Pages (from-to): 505-519
Number of pages: 15
Journal: Biostatistics
Volume: 6
Issue number: 4
DOI: 10.1093/biostatistics/kxi031
State: Published - Oct 2005

Fingerprint

Smoking
Health Expenditures
Quantile
Regression
Covariates
Costs and Cost Analysis
Estimate
Propensity Score
Person
Costs
Health
Medicare
Quantile Function
Medical expenditures
Percentile
Nonparametric Estimator
Econometrics
Maximum Likelihood Estimator
Smoothing
Regression Model

Keywords

  • Comparing means
  • Health expenditures
  • Log-normal
  • Propensity scores
  • Q-Q plots
  • Quantile regression
  • Regression splines
  • Skewed distributions
  • Smoking

ASJC Scopus subject areas

  • Medicine(all)
  • Statistics and Probability
  • Statistics, Probability and Uncertainty

Cite this

Smooth quantile ratio estimation with regression: Estimating medical expenditures for smoking-attributable diseases. / Dominici, Francesca; Zeger, Scott.

In: Biostatistics, Vol. 6, No. 4, 10.2005, p. 505-519.

@article{db21bf2b1d734f89b4741f98e5cfc99e,
title = "Smooth quantile ratio estimation with regression: Estimating medical expenditures for smoking-attributable diseases",
abstract = "The methodological development of this paper is motivated by a common problem in econometrics where we are interested in estimating the difference in the average expenditures between two populations, say with and without a disease, as a function of the covariates. For example, let Y1 and Y2 be two nonnegative random variables denoting the health expenditures for cases and controls. Smooth Quantile Ratio Estimation (SQUARE) is a novel approach for estimating Δ = E[Y1] - E[Y2] by smoothing across percentiles the log-transformed ratio of the two quantile functions. Dominici et al. (2005) have shown that SQUARE defines a large class of estimators of Δ, is more efficient than common parametric and nonparametric estimators of Δ, and is consistent and asymptotically normal. However, in applications it is often desirable to estimate Δ(x) = E[Y1|x] - E[Y2|x], that is, the difference in means as a function of x. In this paper we extend SQUARE to a regression model and we introduce a two-part regression SQUARE for estimating Δ(x) as a function of x. We use the first part of the model to estimate the probability of incurring any costs and the second part of the model to estimate the mean difference in health expenditures, given that a nonzero cost is observed. In the second part of the model, we apply the basic definition of SQUARE for positive costs to compare expenditures for the cases and controls having 'similar' covariate profiles. We determine strata of cases and controls with 'similar' covariate profiles by the use of propensity score matching. We then apply two-part regression SQUARE to the 1987 National Medicare Expenditure Survey to estimate the difference Δ(x) between persons suffering from smoking-attributable diseases and persons without these diseases as a function of the propensity of getting the disease. Using a simulation study, we compare frequentist properties of two-part regression SQUARE with maximum likelihood estimators for the log-transformed expenditures.",
keywords = "Comparing means, Health expenditures, Log-normal, Propensity scores, Q-Q plots, Quantile regression, Regression splines, Skewed distributions, Smoking",
author = "Francesca Dominici and Scott Zeger",
year = "2005",
month = "10",
doi = "10.1093/biostatistics/kxi031",
language = "English (US)",
volume = "6",
pages = "505--519",
journal = "Biostatistics",
issn = "1465-4644",
publisher = "Oxford University Press",
number = "4",

}

TY - JOUR

T1 - Smooth quantile ratio estimation with regression

T2 - Estimating medical expenditures for smoking-attributable diseases

AU - Dominici, Francesca

AU - Zeger, Scott

PY - 2005/10

Y1 - 2005/10

AB - The methodological development of this paper is motivated by a common problem in econometrics where we are interested in estimating the difference in the average expenditures between two populations, say with and without a disease, as a function of the covariates. For example, let Y1 and Y2 be two nonnegative random variables denoting the health expenditures for cases and controls. Smooth Quantile Ratio Estimation (SQUARE) is a novel approach for estimating Δ = E[Y1] - E[Y2] by smoothing across percentiles the log-transformed ratio of the two quantile functions. Dominici et al. (2005) have shown that SQUARE defines a large class of estimators of Δ, is more efficient than common parametric and nonparametric estimators of Δ, and is consistent and asymptotically normal. However, in applications it is often desirable to estimate Δ(x) = E[Y1|x] - E[Y2|x], that is, the difference in means as a function of x. In this paper we extend SQUARE to a regression model and we introduce a two-part regression SQUARE for estimating Δ(x) as a function of x. We use the first part of the model to estimate the probability of incurring any costs and the second part of the model to estimate the mean difference in health expenditures, given that a nonzero cost is observed. In the second part of the model, we apply the basic definition of SQUARE for positive costs to compare expenditures for the cases and controls having 'similar' covariate profiles. We determine strata of cases and controls with 'similar' covariate profiles by the use of propensity score matching. We then apply two-part regression SQUARE to the 1987 National Medicare Expenditure Survey to estimate the difference Δ(x) between persons suffering from smoking-attributable diseases and persons without these diseases as a function of the propensity of getting the disease. Using a simulation study, we compare frequentist properties of two-part regression SQUARE with maximum likelihood estimators for the log-transformed expenditures.

KW - Comparing means

KW - Health expenditures

KW - Log-normal

KW - Propensity scores

KW - Q-Q plots

KW - Quantile regression

KW - Regression splines

KW - Skewed distributions

KW - Smoking

UR - http://www.scopus.com/inward/record.url?scp=26644458846&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=26644458846&partnerID=8YFLogxK

U2 - 10.1093/biostatistics/kxi031

DO - 10.1093/biostatistics/kxi031

M3 - Article

C2 - 15872022

AN - SCOPUS:26644458846

VL - 6

SP - 505

EP - 519

JO - Biostatistics

JF - Biostatistics

SN - 1465-4644

IS - 4

ER -