Model choice in time series studies of air pollution and mortality

Roger D. Peng; Francesca Dominici; Thomas A. Louis

doi:10.1111/j.1467-985X.2006.00410.x

Bloomberg School of Public Health

Research output: Contribution to journal › Review article › peer-review

367 Scopus citations

Abstract

Multicity time series studies of particulate matter and mortality and morbidity have provided evidence that daily variation in air pollution levels is associated with daily variation in mortality counts. These findings served as key epidemiological evidence for the recent review of the US national ambient air quality standards for particulate matter. As a result, methodological issues concerning time series analysis of the relationship between air pollution and health have attracted the attention of the scientific community and critics have raised concerns about the adequacy of current model formulations. Time series data on pollution and mortality are generally analysed by using log-linear, Poisson regression models for overdispersed counts with the daily number of deaths as outcome, the (possibly lagged) daily level of pollution as a linear predictor and smooth functions of weather variables and calendar time used to adjust for time-varying confounders. Investigators around the world have used different approaches to adjust for confounding, making it difficult to compare results across studies. To date, the statistical properties of these different approaches have not been comprehensively compared. To address these issues, we quantify and characterize model uncertainty and model choice in adjusting for seasonal and long-term trends in time series models of air pollution and mortality. First, we conduct a simulation study to compare and describe the properties of statistical methods that are commonly used for confounding adjustment. We generate data under several confounding scenarios and systematically compare the performance of the various methods with respect to the mean-squared error of the estimated air pollution coefficient. We find that the bias in the estimates generally decreases with more aggressive smoothing and that model selection methods which optimize prediction may not be suitable for obtaining an estimate with small bias. 
Second, we apply and compare the modelling approaches with the National Morbidity, Mortality, and Air Pollution Study database which comprises daily time series of several pollutants, weather variables and mortality counts covering the period 1987-2000 for the largest 100 cities in the USA. When applying these approaches to adjusting for seasonal and long-term trends we find that the Study's estimates for the national average effect of PM₁₀ at lag 1 on mortality vary over approximately a twofold range, with 95% posterior intervals always excluding zero risk.
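The modelling setup described in the abstract can be illustrated with a small simulation. The following is a minimal sketch under stated assumptions, not the authors' simulation design: it fits a log-linear Poisson regression of daily deaths on pollution plus an adjustment for seasonal trend, and compares the error in the pollution coefficient as the adjustment becomes more flexible. Harmonic (sine and cosine) terms stand in here for the spline smoothers used in this literature, and every number (baseline of 50 deaths per day, coefficient 0.0005, two years of data) and every function name is hypothetical. Overdispersion is ignored because, in a quasi-Poisson fit, it changes the standard errors but not the point estimate.

```python
import math
import random

TRUE_BETA = 0.0005  # hypothetical log relative rate per unit of pollution


def poisson_draw(rng, lam):
    """Draw a Poisson count by Knuth's multiplication method (fine for moderate means)."""
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for c in range(n):
        piv = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[piv] = m[piv], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][k] * x[k] for k in range(r + 1, n))) / m[r][r]
    return x


def fit_poisson(X, y, iters=25, tol=1e-10):
    """Maximum-likelihood Poisson regression with log link, by Newton's method."""
    p = len(X[0])
    beta = [0.0] * p
    beta[0] = math.log(sum(y) / len(y))  # column 0 is the intercept
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, yi in zip(X, y):
            mu = math.exp(sum(b * x for b, x in zip(beta, row)))
            for j in range(p):
                grad[j] += row[j] * (yi - mu)
                for k in range(j, p):
                    hess[j][k] += mu * row[j] * row[k]
        for j in range(p):  # mirror the upper triangle
            for k in range(j):
                hess[j][k] = hess[k][j]
        step = solve(hess, grad)
        beta = [b + s for b, s in zip(beta, step)]
        if max(abs(s) for s in step) < tol:
            break
    return beta


def simulate(rng, n_days=730):
    """Pollution and mortality share a seasonal pattern, which confounds the association."""
    t = [2 * math.pi * d / 365.0 for d in range(n_days)]
    season = [(math.cos(a), math.cos(3 * a)) for a in t]
    pm = [30 + 10 * c1 + 6 * c3 + rng.gauss(0, 5) for c1, c3 in season]
    deaths = [
        poisson_draw(rng, math.exp(math.log(50) + TRUE_BETA * x + 0.2 * c1 + 0.1 * c3))
        for x, (c1, c3) in zip(pm, season)
    ]
    return t, pm, deaths


def design(t, pm, n_harmonics):
    """Intercept, linear pollution term, and n_harmonics sine/cosine pairs for season."""
    rows = []
    for a, x in zip(t, pm):
        row = [1.0, x]
        for h in range(1, n_harmonics + 1):
            row += [math.cos(h * a), math.sin(h * a)]
        rows.append(row)
    return rows


def run(n_rep=5, seed=1):
    """Return {harmonic pairs: (bias, mean-squared error)} of the pollution coefficient."""
    rng = random.Random(seed)
    errors = {0: [], 1: [], 3: []}
    for _ in range(n_rep):
        t, pm, deaths = simulate(rng)
        for k in errors:
            errors[k].append(fit_poisson(design(t, pm, k), deaths)[1] - TRUE_BETA)
    return {k: (sum(e) / len(e), sum(v * v for v in e) / len(e)) for k, e in errors.items()}


if __name__ == "__main__":
    for k, (bias, mse) in run().items():
        print(f"harmonic pairs={k}: bias={bias:+.5f}, mse={mse:.2e}")
```

In this toy setting the seasonal confounder contains a first and a third harmonic, so the fit with no seasonal terms and the fit with one harmonic pair both leave residual confounding, while the fit with three pairs should remove it. This is only meant to mirror, qualitatively, the abstract's finding that bias tends to decrease with more aggressive adjustment.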

Original language	English (US)
Pages (from-to)	179-203
Number of pages	25
Journal	Journal of the Royal Statistical Society. Series A: Statistics in Society
Volume	169
Issue number	2
DOIs	https://doi.org/10.1111/j.1467-985X.2006.00410.x
State	Published - Mar 2006

Keywords

Air pollution
Log-linear regression
Mortality
Semiparametric regression
Time series

ASJC Scopus subject areas

Statistics and Probability
Social Sciences (miscellaneous)
Economics and Econometrics
Statistics, Probability and Uncertainty

Cite this

@article{66d8311eb8664af6b42c6bdeee5ba667,

title = "Model choice in time series studies of air pollution and mortality",

keywords = "Air pollution, Log-linear regression, Mortality, Semiparametric regression, Time series",

author = "Peng, {Roger D.} and Dominici, Francesca and Louis, {Thomas A.}",

year = "2006",

month = mar,

doi = "10.1111/j.1467-985X.2006.00410.x",

language = "English (US)",

volume = "169",

pages = "179--203",

journal = "Journal of the Royal Statistical Society. Series A: Statistics in Society",

issn = "0964-1998",

publisher = "Wiley-Blackwell",

number = "2",

}

TY - JOUR

T1 - Model choice in time series studies of air pollution and mortality

AU - Peng, Roger D.

AU - Dominici, Francesca

AU - Louis, Thomas A.

PY - 2006/3

Y1 - 2006/3

KW - Air pollution

KW - Log-linear regression

KW - Mortality

KW - Semiparametric regression

KW - Time series

UR - http://www.scopus.com/inward/record.url?scp=33644794010&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33644794010&partnerID=8YFLogxK

U2 - 10.1111/j.1467-985X.2006.00410.x

DO - 10.1111/j.1467-985X.2006.00410.x

M3 - Review article

AN - SCOPUS:33644794010

SN - 0964-1998

VL - 169

SP - 179

EP - 203

JO - Journal of the Royal Statistical Society. Series A: Statistics in Society

JF - Journal of the Royal Statistical Society. Series A: Statistics in Society

IS - 2

ER -
