Model selection and health effect estimation in environmental epidemiology

Francesca Dominici; Chi Wang; Ciprian Crainiceanu; Giovanni Parmigiani

doi:10.1097/EDE.0b013e31817307dc

Bloomberg School of Public Health

Research output: Contribution to journal › Comment/debate › peer-review

20 Scopus citations

Abstract

In air pollution epidemiology, improvements in statistical analysis tools can help improve signal-to-noise ratios and untangle large correlations between exposures and confounders. For this reason, we welcome a novel model-selection approach that helps to identify the time windows of exposure to pollutants that produce adverse health effects. However, there are concerns about approaches that select a model based on a given data set and then estimate health effects in the same data. This can create problems when (1) the sample size is small in relation to the magnitude of the health effects; and (2) candidate predictors are highly correlated and likely to have similar effects. Bayesian Model Averaging has been advocated as a way to estimate health effects that accounts for model uncertainty. However, implementations where posterior model probabilities are approximated using BIC, as well as other default choices, may not reflect the ability of each model to provide an estimate of the health effect that is properly adjusted for confounding. Air pollution studies need to focus on estimating health effects while accounting for the uncertainty in the adjustment for confounding factors. This is especially true when model choice and estimation are performed on the same data. The development of appropriate statistical tools remains an open area of investigation.
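As an illustration of the BIC-based approximation the abstract refers to, the following is a minimal sketch and not the authors' implementation. It assumes equal prior probabilities across candidate models, in which case the posterior probability of model k is commonly approximated as proportional to exp(-BIC_k / 2). The function names and all numbers are hypothetical.

```python
import math


def bic_model_weights(bics):
    """Approximate posterior model probabilities from BIC values.

    Assumes every candidate model has the same prior probability.
    """
    best = min(bics)
    # Subtract the smallest BIC before exponentiating for numerical stability;
    # the shift cancels out after normalization.
    unnormalized = [math.exp(-0.5 * (b - best)) for b in bics]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


def model_averaged_effect(estimates, weights):
    """Weighted average of the model-specific health effect estimates."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical BIC values and exposure effect estimates for three models.
bics = [1000.0, 1002.0, 1010.0]
effects = [0.50, 0.30, 0.10]

weights = bic_model_weights(bics)
averaged = model_averaged_effect(effects, weights)
print(weights, averaged)
```

These weights depend only on each model's BIC, that is, on overall fit and complexity. Nothing in the calculation checks whether a model includes the confounders needed for the exposure effect, which is the concern the abstract raises about default implementations.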

Original language	English (US)
Pages (from-to)	558-560
Number of pages	3
Journal	Epidemiology
Volume	19
Issue number	4
DOIs	https://doi.org/10.1097/EDE.0b013e31817307dc
State	Published - Jul 2008

ASJC Scopus subject areas

Epidemiology

Cite this

@article{34d7e51f82ca48ca821b30c82306c13d,

title = "Model selection and health effect estimation in environmental epidemiology",

abstract = "In air pollution epidemiology, improvements in statistical analysis tools can help improve signal-to-noise ratios and untangle large correlations between exposures and confounders. For this reason, we welcome a novel model-selection approach that helps to identify the time windows of exposure to pollutants that produce adverse health effects. However, there are concerns about approaches that select a model based on a given data set and then estimate health effects in the same data. This can create problems when (1) the sample size is small in relation to the magnitude of the health effects; and (2) candidate predictors are highly correlated and likely to have similar effects. Bayesian Model Averaging has been advocated as a way to estimate health effects that accounts for model uncertainty. However, implementations where posterior model probabilities are approximated using BIC, as well as other default choices, may not reflect the ability of each model to provide an estimate of the health effect that is properly adjusted for confounding. Air pollution studies need to focus on estimating health effects while accounting for the uncertainty in the adjustment for confounding factors. This is especially true when model choice and estimation are performed on the same data. The development of appropriate statistical tools remains an open area of investigation.",

author = "Francesca Dominici and Chi Wang and Ciprian Crainiceanu and Giovanni Parmigiani",

year = "2008",

month = jul,

doi = "10.1097/EDE.0b013e31817307dc",

language = "English (US)",

volume = "19",

pages = "558--560",

journal = "Epidemiology",

issn = "1044-3983",

publisher = "Lippincott Williams and Wilkins",

number = "4",

}

TY - JOUR

T1 - Model selection and health effect estimation in environmental epidemiology

AU - Dominici, Francesca

AU - Wang, Chi

AU - Crainiceanu, Ciprian

AU - Parmigiani, Giovanni

PY - 2008/7

Y1 - 2008/7

AB - In air pollution epidemiology, improvements in statistical analysis tools can help improve signal-to-noise ratios and untangle large correlations between exposures and confounders. For this reason, we welcome a novel model-selection approach that helps to identify the time windows of exposure to pollutants that produce adverse health effects. However, there are concerns about approaches that select a model based on a given data set and then estimate health effects in the same data. This can create problems when (1) the sample size is small in relation to the magnitude of the health effects; and (2) candidate predictors are highly correlated and likely to have similar effects. Bayesian Model Averaging has been advocated as a way to estimate health effects that accounts for model uncertainty. However, implementations where posterior model probabilities are approximated using BIC, as well as other default choices, may not reflect the ability of each model to provide an estimate of the health effect that is properly adjusted for confounding. Air pollution studies need to focus on estimating health effects while accounting for the uncertainty in the adjustment for confounding factors. This is especially true when model choice and estimation are performed on the same data. The development of appropriate statistical tools remains an open area of investigation.

UR - http://www.scopus.com/inward/record.url?scp=49849095313&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=49849095313&partnerID=8YFLogxK

U2 - 10.1097/EDE.0b013e31817307dc

DO - 10.1097/EDE.0b013e31817307dc

M3 - Comment/debate

C2 - 18552590

AN - SCOPUS:49849095313

SN - 1044-3983

VL - 19

SP - 558

EP - 560

JO - Epidemiology

JF - Epidemiology

IS - 4

ER -
