TY - JOUR
T1 - A Likelihood-Based Approach to Identifying Contaminated Food Products Using Sales Data
T2 - Performance and Challenges
AU - Kaufman, James
AU - Lessler, Justin
AU - Harry, April
AU - Edlund, Stefan
AU - Hu, Kun
AU - Douglas, Judith
AU - Thoens, Christian
AU - Appel, Bernd
AU - Käsbohrer, Annemarie
AU - Filter, Matthias
PY - 2014
Y1 - 2014/07//
N2 - Foodborne disease outbreaks of recent years demonstrate that, because of increasingly interconnected supply chains, these types of crisis situations have the potential to affect thousands of people, leading to significant healthcare costs, loss of revenue for food companies, and, in the worst cases, death. When a disease outbreak is detected, identifying the contaminated food quickly is vital to minimize suffering and limit economic losses. Here we present a likelihood-based approach that has the potential to reduce the time needed to identify possibly contaminated food products by exploiting food product sales data and the distribution of foodborne illness case reports. Using a real-world food sales data set and artificially generated outbreak scenarios, we show that this method performs very well for contamination scenarios originating from a single "guilty" food product. As it is neither always possible nor necessary to identify the single offending product, the method has been extended so that it can be used as a binary classifier. With this extension, it is possible to generate a set of potentially "guilty" products that contains the real outbreak source with very high accuracy. Furthermore, we explore the patterns of food distributions that lead to "hard-to-identify" foods, the possibility of identifying these food groups a priori, and the extent to which the likelihood-based method can be used to quantify uncertainty. We find that high spatial correlation of sales data between products may be a useful indicator for "hard-to-identify" products.
AB - Foodborne disease outbreaks of recent years demonstrate that, because of increasingly interconnected supply chains, these types of crisis situations have the potential to affect thousands of people, leading to significant healthcare costs, loss of revenue for food companies, and, in the worst cases, death. When a disease outbreak is detected, identifying the contaminated food quickly is vital to minimize suffering and limit economic losses. Here we present a likelihood-based approach that has the potential to reduce the time needed to identify possibly contaminated food products by exploiting food product sales data and the distribution of foodborne illness case reports. Using a real-world food sales data set and artificially generated outbreak scenarios, we show that this method performs very well for contamination scenarios originating from a single "guilty" food product. As it is neither always possible nor necessary to identify the single offending product, the method has been extended so that it can be used as a binary classifier. With this extension, it is possible to generate a set of potentially "guilty" products that contains the real outbreak source with very high accuracy. Furthermore, we explore the patterns of food distributions that lead to "hard-to-identify" foods, the possibility of identifying these food groups a priori, and the extent to which the likelihood-based method can be used to quantify uncertainty. We find that high spatial correlation of sales data between products may be a useful indicator for "hard-to-identify" products.
UR - http://www.scopus.com/inward/record.url?scp=84905454878&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84905454878&partnerID=8YFLogxK
U2 - 10.1371/journal.pcbi.1003692
DO - 10.1371/journal.pcbi.1003692
M3 - Article
C2 - 24992565
AN - SCOPUS:84905454878
SN - 1553-734X
VL - 10
JO - PLoS Computational Biology
JF - PLoS Computational Biology
IS - 7
M1 - e1003692
ER -