Targeted maximum likelihood estimation (TMLE) is a general method for estimating parameters in semiparametric and nonparametric models. The key step in any TMLE implementation is constructing a sequence of least-favorable parametric models for the parameter of interest. This has been done for a variety of parameters arising in causal inference problems, by augmenting standard regression models with a "clever-covariate." That approach requires deriving such a covariate for each new type of problem; for some problems such a covariate does not exist. To address these issues, we give a general TMLE implementation based on exponential families. This approach does not require deriving a clever-covariate, and it can be used to implement TMLE for estimating any smooth parameter in the nonparametric model. A computational advantage is that each iteration of TMLE involves estimation of a parameter in an exponential family, which is a convex optimization problem for which software implementing reliable and computationally efficient methods exists. We illustrate the method in three estimation problems, involving the mean of an outcome missing at random, the parameter of a median regression model, and the causal effect of a continuous exposure, respectively. We conduct a simulation study comparing different choices for the parametric submodel. We find that the choice of submodel can have an important impact on the behavior of the estimator in finite samples.
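As a rough illustration of the exponential-family idea described above, the following sketch works through a toy problem that is not one of the three examples in the paper: estimating the mean of a discrete variable, starting from a deliberately poor initial distribution. It assumes the fluctuation p_eps(x) proportional to exp(eps * D(x)) * p(x), where D(x) = x - mean(p) is the efficient influence function of the mean. This is one possible exponential submodel, not necessarily the one the authors use in their examples. All function and variable names (`tilt`, `neg_loglik`, `tmle_mean`, `support`, `p0`) are hypothetical. Fitting eps is a one-dimensional convex problem, handed here to SciPy's scalar minimizer.

```python
import numpy as np
from scipy.optimize import minimize_scalar
from scipy.special import logsumexp


def tilt(p, d, eps):
    """Exponentially tilted pmf: p_eps(x) proportional to exp(eps * d(x)) * p(x)."""
    logw = eps * d + np.log(p)
    logw -= logw.max()  # subtract the max for numerical stability
    w = np.exp(logw)
    return w / w.sum()


def neg_loglik(eps, p, d_support, d_obs_mean):
    """Negative average log-likelihood of eps, dropping terms free of eps.

    The log-normalizer log(sum_x p(x) exp(eps * d(x))) is convex in eps,
    so the whole objective is convex.
    """
    return logsumexp(eps * d_support, b=p) - eps * d_obs_mean


def tmle_mean(support, p0, obs_idx, tol=1e-6, max_iter=20):
    """Toy TMLE for the mean on a finite support (assumed setting)."""
    p = p0.copy()
    for _ in range(max_iter):
        mu = np.sum(support * p)
        d = support - mu                 # influence function of the mean at p
        score = d[obs_idx].mean()        # empirical mean of the influence function
        if abs(score) < tol:             # estimating equation solved: stop
            break
        res = minimize_scalar(neg_loglik, args=(p, d, score))
        p = tilt(p, d, res.x)            # move along the fitted submodel
    return np.sum(support * p), p


# Simulated data (assumption for illustration only).
rng = np.random.default_rng(0)
support = np.arange(11, dtype=float)
obs = rng.binomial(10, 0.3, size=500)    # observed values double as indices into support
p0 = np.full(support.size, 1.0 / support.size)  # misspecified initial estimate

estimate, p_star = tmle_mean(support, p0, obs)
print(estimate, obs.mean())
```

In this toy case the targeted estimate should agree with the sample mean up to optimizer tolerance, because the tilted family has the identity as its sufficient statistic. For the parameters considered in the paper the influence function depends on the current fit, so several iterations would typically be needed.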
Keywords
- convex optimization
- exponential family
ASJC Scopus subject areas
- Statistics and Probability
- Statistics, Probability and Uncertainty