Evaluating meteorological data from weather stations, and from satellites and global models for a multi-site epidemiological study

The MAL-ED Network

Research output: Contribution to journal › Article

Abstract

Background: Longitudinal and time series analyses are needed to characterize the associations between hydrometeorological parameters and health outcomes. Earth Observation (EO) climate data products derived from satellites and global model-based reanalysis have the potential to be used as surrogates in situations and locations where weather-station-based observations are inadequate or incomplete. However, these products often lack direct evaluation at specific sites of epidemiological interest.

Methods: Standard evaluation metrics of correlation, agreement, bias and error were applied to a set of ten hydrometeorological variables extracted from two quasi-global, commonly used climate data products – the Global Land Data Assimilation System (GLDAS) and Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) – to evaluate their performance relative to weather-station-derived estimates at the specific geographic locations of the eight sites in a multi-site cohort study. These metrics were calculated for both daily estimates and 7-day averages, and for a rotavirus peak-season subset. The variables from the two sources were then each used as predictors in longitudinal regression models to test their association with rotavirus infection in the cohort after adjusting for covariates.

Results: The availability and completeness of station-based validation data varied depending on the variable and study site. The performance of the two gridded climate models varied considerably within the same location, for the same variable across locations, according to the evaluation criterion used, and for the peak season compared with the full dataset, in ways that showed no obvious pattern. The station-based and EO-derived estimates also differed in the statistical significance of their association with the rotavirus outcome: for some variables, the station-based records showed a strong association while the EO-derived estimates showed none; for others, the opposite was true.

Conclusion: Researchers wishing to use publicly available climate data – whether EO-derived or station-based – are advised to recognize their specific limitations in both the analysis and the interpretation of results. Epidemiologists engaged in prospective research into environmentally driven diseases should install their own weather monitoring stations at their study sites whenever possible, to avoid having to choose between distant or incomplete station data and unverified EO estimates.
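
As an illustration of the kind of evaluation the Methods describe, the following is a minimal sketch, not the authors' code. The abstract names only the metric families (correlation, agreement, bias, error), so the specific choices here are assumptions: Pearson's r for correlation, mean bias, mean absolute error and root-mean-square error, with no agreement index shown. The 7-day averages are assumed to be non-overlapping blocks, and the temperature values are invented for the example.

```python
import math

def paired(station, product):
    """Keep only the days on which both sources report a value."""
    return [(s, p) for s, p in zip(station, product)
            if s is not None and p is not None]

def evaluate(station, product):
    """Return n, Pearson r, mean bias, MAE and RMSE of product vs. station."""
    pairs = paired(station, product)
    n = len(pairs)
    if n < 2:
        raise ValueError("need at least two paired observations")
    mean_s = sum(s for s, _ in pairs) / n
    mean_p = sum(p for _, p in pairs) / n
    cov = sum((s - mean_s) * (p - mean_p) for s, p in pairs)
    var_s = sum((s - mean_s) ** 2 for s, _ in pairs)
    var_p = sum((p - mean_p) ** 2 for _, p in pairs)
    errors = [p - s for s, p in pairs]  # positive = product overestimates
    r = cov / math.sqrt(var_s * var_p) if var_s and var_p else float("nan")
    return {
        "n": n,
        "pearson_r": r,
        "bias": sum(errors) / n,
        "mae": sum(abs(e) for e in errors) / n,
        "rmse": math.sqrt(sum(e * e for e in errors) / n),
    }

def block_means(values, width=7):
    """Average consecutive non-overlapping blocks, ignoring missing days."""
    out = []
    for start in range(0, len(values), width):
        present = [v for v in values[start:start + width] if v is not None]
        out.append(sum(present) / len(present) if present else None)
    return out

# Hypothetical daily temperatures (deg C); None marks a missing station day.
station = [20.0, 21.0, None, 23.0, 24.0, 25.0, 26.0,
           27.0, 28.0, 29.0, 30.0, 31.0, 32.0, 33.0]
# Hypothetical gridded-product values for the same days.
product = [21.0, 22.0, 23.0, 24.0, 25.0, 26.0, 27.0,
           28.0, 29.0, 30.0, 31.0, 32.0, 33.0, 34.0]

daily = evaluate(station, product)
weekly = evaluate(block_means(station), block_means(product))
```

Restricting both series to a peak-season date range before calling `evaluate` would give the seasonal subset; how the study defined that season is not stated in the abstract.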

Original language: English (US)
Pages (from-to): 91-109
Number of pages: 19
Journal: Environmental Research
Volume: 165
DOI: 10.1016/j.envres.2018.02.027
State: Published - Aug 1 2018

Fingerprint

Weather
weather station
Climate
Epidemiologic Studies
Earth (planet)
Satellites
Observation
Rotavirus
climate
Climate models
Rotavirus Infections
Geographic Locations
Time series
Hazards
Information Systems
data assimilation
Health
Availability
Infrared radiation
climate modeling

Keywords

  • Climate
  • Earth Observation data
  • Environmental epidemiology
  • Meteorological data
  • Rotavirus

ASJC Scopus subject areas

  • Biochemistry
  • Environmental Science (all)

Cite this

Evaluating meteorological data from weather stations, and from satellites and global models for a multi-site epidemiological study. / The MAL-ED Network.

In: Environmental research, Vol. 165, 01.08.2018, p. 91-109.

@article{a0657e1e9c024a029b2e3a2c3adf2c68,
title = "Evaluating meteorological data from weather stations, and from satellites and global models for a multi-site epidemiological study",
abstract = "Background: Longitudinal and time series analyses are needed to characterize the associations between hydrometeorological parameters and health outcomes. Earth Observation (EO) climate data products derived from satellites and global model-based reanalysis have the potential to be used as surrogates in situations and locations where weather-station-based observations are inadequate or incomplete. However, these products often lack direct evaluation at specific sites of epidemiological interest. Methods: Standard evaluation metrics of correlation, agreement, bias and error were applied to a set of ten hydrometeorological variables extracted from two quasi-global, commonly used climate data products – the Global Land Data Assimilation System (GLDAS) and Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) – to evaluate their performance relative to weather-station-derived estimates at the specific geographic locations of the eight sites in a multi-site cohort study. These metrics were calculated for both daily estimates and 7-day averages, and for a rotavirus peak-season subset. The variables from the two sources were then each used as predictors in longitudinal regression models to test their association with rotavirus infection in the cohort after adjusting for covariates. Results: The availability and completeness of station-based validation data varied depending on the variable and study site. The performance of the two gridded climate models varied considerably within the same location, for the same variable across locations, according to the evaluation criterion used, and for the peak season compared with the full dataset, in ways that showed no obvious pattern. The station-based and EO-derived estimates also differed in the statistical significance of their association with the rotavirus outcome: for some variables, the station-based records showed a strong association while the EO-derived estimates showed none; for others, the opposite was true. Conclusion: Researchers wishing to use publicly available climate data – whether EO-derived or station-based – are advised to recognize their specific limitations in both the analysis and the interpretation of results. Epidemiologists engaged in prospective research into environmentally driven diseases should install their own weather monitoring stations at their study sites whenever possible, to avoid having to choose between distant or incomplete station data and unverified EO estimates.",
keywords = "Climate, Earth Observation data, Environmental epidemiology, Meteorological data, Rotavirus",
author = "{The MAL-ED Network} and Colston, {Josh M.} and Tahmeed Ahmed and Cloupas Mahopo and Gagandeep Kang and Margaret Kosek and {de Sousa Junior}, Francisco and Shrestha, {Prakash Sunder} and Erling Svensen and Ali Turab and Benjamin Zaitchik",
year = "2018",
month = "8",
day = "1",
doi = "10.1016/j.envres.2018.02.027",
language = "English (US)",
volume = "165",
pages = "91--109",
journal = "Environmental Research",
issn = "0013-9351",
publisher = "Academic Press Inc.",

}

TY - JOUR

T1 - Evaluating meteorological data from weather stations, and from satellites and global models for a multi-site epidemiological study

AU - The MAL-ED Network

AU - Colston, Josh M.

AU - Ahmed, Tahmeed

AU - Mahopo, Cloupas

AU - Kang, Gagandeep

AU - Kosek, Margaret

AU - de Sousa Junior, Francisco

AU - Shrestha, Prakash Sunder

AU - Svensen, Erling

AU - Turab, Ali

AU - Zaitchik, Benjamin

PY - 2018/8/1

Y1 - 2018/8/1

N2 - Background: Longitudinal and time series analyses are needed to characterize the associations between hydrometeorological parameters and health outcomes. Earth Observation (EO) climate data products derived from satellites and global model-based reanalysis have the potential to be used as surrogates in situations and locations where weather-station-based observations are inadequate or incomplete. However, these products often lack direct evaluation at specific sites of epidemiological interest. Methods: Standard evaluation metrics of correlation, agreement, bias and error were applied to a set of ten hydrometeorological variables extracted from two quasi-global, commonly used climate data products – the Global Land Data Assimilation System (GLDAS) and Climate Hazards Group InfraRed Precipitation with Stations (CHIRPS) – to evaluate their performance relative to weather-station-derived estimates at the specific geographic locations of the eight sites in a multi-site cohort study. These metrics were calculated for both daily estimates and 7-day averages, and for a rotavirus peak-season subset. The variables from the two sources were then each used as predictors in longitudinal regression models to test their association with rotavirus infection in the cohort after adjusting for covariates. Results: The availability and completeness of station-based validation data varied depending on the variable and study site. The performance of the two gridded climate models varied considerably within the same location, for the same variable across locations, according to the evaluation criterion used, and for the peak season compared with the full dataset, in ways that showed no obvious pattern. The station-based and EO-derived estimates also differed in the statistical significance of their association with the rotavirus outcome: for some variables, the station-based records showed a strong association while the EO-derived estimates showed none; for others, the opposite was true. Conclusion: Researchers wishing to use publicly available climate data – whether EO-derived or station-based – are advised to recognize their specific limitations in both the analysis and the interpretation of results. Epidemiologists engaged in prospective research into environmentally driven diseases should install their own weather monitoring stations at their study sites whenever possible, to avoid having to choose between distant or incomplete station data and unverified EO estimates.

KW - Climate

KW - Earth Observation data

KW - Environmental epidemiology

KW - Meteorological data

KW - Rotavirus

UR - http://www.scopus.com/inward/record.url?scp=85045768516&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85045768516&partnerID=8YFLogxK

U2 - 10.1016/j.envres.2018.02.027

DO - 10.1016/j.envres.2018.02.027

M3 - Article

C2 - 29684739

AN - SCOPUS:85045768516

VL - 165

SP - 91

EP - 109

JO - Environmental Research

JF - Environmental Research

SN - 0013-9351

ER -