Model choice in time series studies of air pollution and mortality

Roger Peng, Francesca Dominici, Thomas Louis

Research output: Contribution to journal › Article

Abstract

Multicity time series studies of particulate matter and mortality and morbidity have provided evidence that daily variation in air pollution levels is associated with daily variation in mortality counts. These findings served as key epidemiological evidence for the recent review of the US national ambient air quality standards for particulate matter. As a result, methodological issues concerning time series analysis of the relationship between air pollution and health have attracted the attention of the scientific community and critics have raised concerns about the adequacy of current model formulations. Time series data on pollution and mortality are generally analysed by using log-linear, Poisson regression models for overdispersed counts with the daily number of deaths as outcome, the (possibly lagged) daily level of pollution as a linear predictor and smooth functions of weather variables and calendar time used to adjust for time-varying confounders. Investigators around the world have used different approaches to adjust for confounding, making it difficult to compare results across studies. To date, the statistical properties of these different approaches have not been comprehensively compared. To address these issues, we quantify and characterize model uncertainty and model choice in adjusting for seasonal and long-term trends in time series models of air pollution and mortality. First, we conduct a simulation study to compare and describe the properties of statistical methods that are commonly used for confounding adjustment. We generate data under several confounding scenarios and systematically compare the performance of the various methods with respect to the mean-squared error of the estimated air pollution coefficient. We find that the bias in the estimates generally decreases with more aggressive smoothing and that model selection methods which optimize prediction may not be suitable for obtaining an estimate with small bias. 
Second, we apply and compare the modelling approaches with the National Morbidity, Mortality, and Air Pollution Study database which comprises daily time series of several pollutants, weather variables and mortality counts covering the period 1987-2000 for the largest 100 cities in the USA. When applying these approaches to adjusting for seasonal and long-term trends we find that the Study's estimates for the national average effect of PM10 at lag 1 on mortality vary over approximately a twofold range, with 95% posterior intervals always excluding zero risk.
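The first finding above, that residual confounding bias shrinks as the time trend is modelled more flexibly, can be illustrated with a small simulation. The sketch below is not the authors' code. It assumes a single unmeasured confounder that cycles three times a year, swaps the spline smoothers used in practice for sine/cosine harmonics of calendar time, and simulates plain Poisson counts (overdispersion would change the standard errors, not the point estimate). All function names, the true coefficient of 0.05 and the baseline of 50 deaths per day are illustrative assumptions.

```python
import math
import random


def poisson_draw(rng, mu):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit, k, prod = math.exp(-mu), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


def simulate(n_days=730, beta=0.05, seed=1):
    """Daily pollution and death counts sharing one unmeasured seasonal confounder."""
    rng = random.Random(seed)
    pollution, deaths = [], []
    for t in range(n_days):
        confounder = math.sin(2.0 * math.pi * 3 * t / 365.0)  # 3 cycles per year
        x = confounder + rng.gauss(0.0, 0.5)
        mu = math.exp(math.log(50.0) + beta * x + 0.2 * confounder)
        pollution.append(x)
        deaths.append(poisson_draw(rng, mu))
    return pollution, deaths


def harmonic_basis(n_days, n_harmonics, period=365.0):
    """Intercept plus sine/cosine pairs: a stand-in for a smooth function of time."""
    rows = []
    for t in range(n_days):
        row = [1.0]
        for k in range(1, n_harmonics + 1):
            angle = 2.0 * math.pi * k * t / period
            row += [math.sin(angle), math.cos(angle)]
        rows.append(row)
    return rows


def solve(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def fit_poisson(design, y, n_iter=15):
    """Log-linear Poisson regression fitted by iteratively reweighted least squares."""
    p = len(design[0])
    coef = [0.0] * p
    coef[0] = math.log(sum(y) / len(y))  # start at the log of the mean count
    for _ in range(n_iter):
        xtwx = [[0.0] * p for _ in range(p)]
        xtwz = [0.0] * p
        for row, yi in zip(design, y):
            eta = sum(c * v for c, v in zip(coef, row))
            mu = math.exp(eta)
            z = eta + (yi - mu) / mu  # working response; the weight is mu
            for i in range(p):
                xtwz[i] += row[i] * mu * z
                for j in range(p):
                    xtwx[i][j] += row[i] * mu * row[j]
        coef = solve(xtwx, xtwz)
    return coef


def pollution_estimate(pollution, deaths, n_harmonics):
    """Estimated log-relative-rate for pollution after adjusting for time."""
    basis = harmonic_basis(len(deaths), n_harmonics)
    design = [row + [x] for row, x in zip(basis, pollution)]
    return fit_poisson(design, deaths)[-1]  # pollution is the last column


if __name__ == "__main__":
    pollution, deaths = simulate()
    for k in (0, 1, 3, 6):
        print(k, round(pollution_estimate(pollution, deaths, k), 4))
```

In this setup, a basis with fewer than three harmonics cannot absorb the confounder, so the pollution coefficient stays biased upward. With three or more harmonics the estimate should land close to the assumed true value of 0.05, at the cost of a somewhat larger variance as further terms are added.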

Original language: English (US)
Pages (from-to): 179-203
Number of pages: 25
Journal: Journal of the Royal Statistical Society. Series A: Statistics in Society
Volume: 169
Issue number: 2
DOI: 10.1111/j.1467-985X.2006.00410.x
State: Published - Mar 2006

Fingerprint

Model Choice
Air Pollution
Mortality
Time Series
Confounding
Particulate Matter
Count
Morbidity
Trend
Pollution
Weather
Estimate
Poisson Regression
Calendar
Air Quality
Time Series Analysis

Keywords

  • Air pollution
  • Log-linear regression
  • Mortality
  • Semiparametric regression
  • Time series

ASJC Scopus subject areas

  • Statistics and Probability
  • Economics and Econometrics
  • Social Sciences (miscellaneous)

Cite this

Model choice in time series studies of air pollution and mortality. / Peng, Roger; Dominici, Francesca; Louis, Thomas.

In: Journal of the Royal Statistical Society. Series A: Statistics in Society, Vol. 169, No. 2, 03.2006, p. 179-203.


@article{66d8311eb8664af6b42c6bdeee5ba667,
title = "Model choice in time series studies of air pollution and mortality",
abstract = "Multicity time series studies of particulate matter and mortality and morbidity have provided evidence that daily variation in air pollution levels is associated with daily variation in mortality counts. These findings served as key epidemiological evidence for the recent review of the US national ambient air quality standards for particulate matter. As a result, methodological issues concerning time series analysis of the relationship between air pollution and health have attracted the attention of the scientific community and critics have raised concerns about the adequacy of current model formulations. Time series data on pollution and mortality are generally analysed by using log-linear, Poisson regression models for overdispersed counts with the daily number of deaths as outcome, the (possibly lagged) daily level of pollution as a linear predictor and smooth functions of weather variables and calendar time used to adjust for time-varying confounders. Investigators around the world have used different approaches to adjust for confounding, making it difficult to compare results across studies. To date, the statistical properties of these different approaches have not been comprehensively compared. To address these issues, we quantify and characterize model uncertainty and model choice in adjusting for seasonal and long-term trends in time series models of air pollution and mortality. First, we conduct a simulation study to compare and describe the properties of statistical methods that are commonly used for confounding adjustment. We generate data under several confounding scenarios and systematically compare the performance of the various methods with respect to the mean-squared error of the estimated air pollution coefficient. We find that the bias in the estimates generally decreases with more aggressive smoothing and that model selection methods which optimize prediction may not be suitable for obtaining an estimate with small bias. 
Second, we apply and compare the modelling approaches with the National Morbidity, Mortality, and Air Pollution Study database which comprises daily time series of several pollutants, weather variables and mortality counts covering the period 1987-2000 for the largest 100 cities in the USA. When applying these approaches to adjusting for seasonal and long-term trends we find that the Study's estimates for the national average effect of PM10 at lag 1 on mortality vary over approximately a twofold range, with 95{\%} posterior intervals always excluding zero risk.",
keywords = "Air pollution, Log-linear regression, Mortality, Semiparametric regression, Time series",
author = "Roger Peng and Francesca Dominici and Thomas Louis",
year = "2006",
month = "3",
doi = "10.1111/j.1467-985X.2006.00410.x",
language = "English (US)",
volume = "169",
pages = "179--203",
journal = "Journal of the Royal Statistical Society. Series A: Statistics in Society",
issn = "0964-1998",
publisher = "Wiley-Blackwell",
number = "2",

}

TY - JOUR

T1 - Model choice in time series studies of air pollution and mortality

AU - Peng, Roger

AU - Dominici, Francesca

AU - Louis, Thomas

PY - 2006/3

Y1 - 2006/3

N2 - Multicity time series studies of particulate matter and mortality and morbidity have provided evidence that daily variation in air pollution levels is associated with daily variation in mortality counts. These findings served as key epidemiological evidence for the recent review of the US national ambient air quality standards for particulate matter. As a result, methodological issues concerning time series analysis of the relationship between air pollution and health have attracted the attention of the scientific community and critics have raised concerns about the adequacy of current model formulations. Time series data on pollution and mortality are generally analysed by using log-linear, Poisson regression models for overdispersed counts with the daily number of deaths as outcome, the (possibly lagged) daily level of pollution as a linear predictor and smooth functions of weather variables and calendar time used to adjust for time-varying confounders. Investigators around the world have used different approaches to adjust for confounding, making it difficult to compare results across studies. To date, the statistical properties of these different approaches have not been comprehensively compared. To address these issues, we quantify and characterize model uncertainty and model choice in adjusting for seasonal and long-term trends in time series models of air pollution and mortality. First, we conduct a simulation study to compare and describe the properties of statistical methods that are commonly used for confounding adjustment. We generate data under several confounding scenarios and systematically compare the performance of the various methods with respect to the mean-squared error of the estimated air pollution coefficient. We find that the bias in the estimates generally decreases with more aggressive smoothing and that model selection methods which optimize prediction may not be suitable for obtaining an estimate with small bias. 
Second, we apply and compare the modelling approaches with the National Morbidity, Mortality, and Air Pollution Study database which comprises daily time series of several pollutants, weather variables and mortality counts covering the period 1987-2000 for the largest 100 cities in the USA. When applying these approaches to adjusting for seasonal and long-term trends we find that the Study's estimates for the national average effect of PM10 at lag 1 on mortality vary over approximately a twofold range, with 95% posterior intervals always excluding zero risk.


KW - Air pollution

KW - Log-linear regression

KW - Mortality

KW - Semiparametric regression

KW - Time series

UR - http://www.scopus.com/inward/record.url?scp=33644794010&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=33644794010&partnerID=8YFLogxK

U2 - 10.1111/j.1467-985X.2006.00410.x

DO - 10.1111/j.1467-985X.2006.00410.x

M3 - Article

AN - SCOPUS:33644794010

VL - 169

SP - 179

EP - 203

JO - Journal of the Royal Statistical Society. Series A: Statistics in Society

JF - Journal of the Royal Statistical Society. Series A: Statistics in Society

SN - 0964-1998

IS - 2

ER -