Vibrio parahaemolyticus in the Chesapeake Bay: Operational In Situ Prediction and Forecast Models Can Benefit from Inclusion of Lagged Water Quality Measurements

Benjamin Davis, John M. Jacobs, Benjamin Zaitchik, Angelo DePaola, Frank C Curriero

Research output: Contribution to journal › Article

Abstract

Vibrio parahaemolyticus is a leading cause of seafood-borne gastroenteritis. Given its natural presence in brackish waters, there is a need to develop operational forecast models that can sufficiently predict the bacterium's spatial and temporal variation. This work attempted to develop V. parahaemolyticus prediction models using frequently measured time-indexed and -lagged water quality measures. Models were built using a large data set (n = 1,043) of surface water samples from 2007 to 2010 previously analyzed for V. parahaemolyticus in the Chesapeake Bay. Water quality variables were classified as time indexed, 1-month lag, and 2-month lag. Tobit regression models were used to account for V. parahaemolyticus measures below the limit of quantification and to simultaneously estimate the presence and abundance of the bacterium. Models were evaluated using cross-validation and metrics that quantify prediction bias and uncertainty. Presence classification models containing only one type of water quality parameter (e.g., temperature) performed poorly, while models with additional water quality parameters (i.e., salinity, clarity, and dissolved oxygen) performed well. Lagged variable models performed similarly to time-indexed models, and lagged variables occasionally contained a predictive power that was independent of or superior to that of time-indexed variables. Abundance estimation models were less effective, primarily due to a restricted number of samples with abundances above the limit of quantification. These findings indicate that an operational in situ prediction model is attainable but will require a variety of water quality measurements and that lagged measurements will be particularly useful for forecasting. 
Future work will expand variable selection for prediction models and extend the spatial-temporal extent of predictions by using geostatistical interpolation techniques.

Importance

Vibrio parahaemolyticus is one of the leading causes of seafood-borne illness in the United States and across the globe. Exposure often occurs from the consumption of raw shellfish. Despite public health concerns, there have been only sporadic efforts to develop environmental prediction and forecast models for the bacterium preharvest. This analysis used commonly sampled water quality measurements of temperature, salinity, dissolved oxygen, and clarity to develop models for V. parahaemolyticus in surface water. Predictors also included measurements taken months before water was tested for the bacterium. Results revealed that the use of multiple water quality measurements is necessary for satisfactory prediction performance, challenging current efforts to manage the risk of infection based upon water temperature alone. The results also highlight the potential advantage of including historical water quality measurements. This analysis shows promise and lays the groundwork for future operational prediction and forecast models.
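The abstract's use of Tobit regression with lagged predictors can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the limit-of-quantification value, the coefficients, and the example series are all hypothetical, and "presence" is approximated here as the probability that the latent abundance exceeds the limit of quantification, which is an assumption rather than something the abstract specifies.

```python
import math


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def normal_logpdf(z):
    """Log density of the standard normal distribution."""
    return -0.5 * z * z - 0.5 * math.log(2.0 * math.pi)


def add_lag(series, k):
    """Shift a regularly spaced (e.g. monthly) series back by k steps.

    Positions with no earlier measurement are filled with None.
    """
    return [series[i - k] if i >= k else None for i in range(len(series))]


def linear_predictor(xi, beta):
    """Mean of the latent abundance for one sample."""
    return sum(b * x for b, x in zip(beta, xi))


def tobit_loglik(y, X, beta, sigma, loq):
    """Log-likelihood of a left-censored (Tobit) model.

    Samples with y <= loq are treated as censored at the limit of
    quantification; all others contribute a normal density term.
    """
    total = 0.0
    for yi, xi in zip(y, X):
        mu = linear_predictor(xi, beta)
        if yi <= loq:
            # Censored sample: probability that the latent value is <= loq.
            p = normal_cdf((loq - mu) / sigma)
            total += math.log(max(p, 1e-300))
        else:
            # Quantified sample: ordinary Gaussian log density.
            total += normal_logpdf((yi - mu) / sigma) - math.log(sigma)
    return total


def prob_above_loq(xi, beta, sigma, loq):
    """Probability that the latent abundance exceeds the limit of quantification."""
    mu = linear_predictor(xi, beta)
    return 1.0 - normal_cdf((loq - mu) / sigma)


if __name__ == "__main__":
    # Hypothetical monthly water temperatures (degrees C) at one station.
    temperature = [8.0, 12.0, 18.0, 24.0, 27.0]
    temperature_lag1 = add_lag(temperature, 1)

    # Design rows: intercept, current temperature, 1-month lagged temperature.
    rows = [
        [1.0, t, t1]
        for t, t1 in zip(temperature, temperature_lag1)
        if t1 is not None
    ]
    beta = [-2.0, 0.05, 0.08]  # hypothetical coefficients
    for row in rows:
        print(row, round(prob_above_loq(row, beta, sigma=1.0, loq=0.5), 3))
```

In practice the coefficients and sigma would be estimated by maximizing `tobit_loglik` with a numerical optimizer, and the fitted model would then be scored on held-out samples, in the spirit of the cross-validation the abstract mentions.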

Original language: English (US)
Journal: Applied and Environmental Microbiology
Volume: 85
Issue number: 17
DOI: 10.1128/AEM.01007-19
State: Published - Sep 1 2019

Keywords

  • Chesapeake Bay
  • forecast
  • prediction
  • public health
  • temporal lags
  • Tobit regression
  • Vibrio parahaemolyticus

ASJC Scopus subject areas

  • Biotechnology
  • Food Science
  • Applied Microbiology and Biotechnology
  • Ecology

Cite this

@article{106e03efc4464c7db601c05dcadc089d,
title = "Vibrio parahaemolyticus in the Chesapeake Bay: Operational In Situ Prediction and Forecast Models Can Benefit from Inclusion of Lagged Water Quality Measurements",
keywords = "Chesapeake Bay, forecast, prediction, public health, temporal lags, Tobit regression, Vibrio parahaemolyticus",
author = "Benjamin Davis and Jacobs, {John M.} and Benjamin Zaitchik and Angelo DePaola and Curriero, {Frank C}",
year = "2019",
month = "9",
day = "1",
doi = "10.1128/AEM.01007-19",
language = "English (US)",
volume = "85",
journal = "Applied and Environmental Microbiology",
issn = "0099-2240",
publisher = "American Society for Microbiology",
number = "17",

}

TY - JOUR

T1 - Vibrio parahaemolyticus in the Chesapeake Bay

T2 - Operational In Situ Prediction and Forecast Models Can Benefit from Inclusion of Lagged Water Quality Measurements

AU - Davis, Benjamin

AU - Jacobs, John M.

AU - Zaitchik, Benjamin

AU - DePaola, Angelo

AU - Curriero, Frank C

PY - 2019/9/1

Y1 - 2019/9/1

KW - Chesapeake Bay

KW - forecast

KW - prediction

KW - public health

KW - temporal lags

KW - Tobit regression

KW - Vibrio parahaemolyticus

UR - http://www.scopus.com/inward/record.url?scp=85071355109&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85071355109&partnerID=8YFLogxK

U2 - 10.1128/AEM.01007-19

DO - 10.1128/AEM.01007-19

M3 - Article

C2 - 31253685

AN - SCOPUS:85071355109

VL - 85

JO - Applied and Environmental Microbiology

JF - Applied and Environmental Microbiology

SN - 0099-2240

IS - 17

ER -