Big-data and machine learning to revamp computational toxicology and its use in risk assessment

Thomas Luechtefeld, Craig Rowlands, Thomas Hartung

Research output: Contribution to journal › Article

Abstract

The creation of large toxicological databases and advances in machine-learning techniques have empowered computational approaches in toxicology. Work with these large databases based on regulatory data has allowed reproducibility assessment of animal models, which highlight weaknesses in traditional in vivo methods. This should lower the bars for the introduction of new approaches and represents a benchmark that is achievable for any alternative method validated against these methods. Quantitative Structure Activity Relationships (QSAR) models for skin sensitization, eye irritation, and other human health hazards based on these big databases, however, also have made apparent some of the challenges facing computational modeling, including validation challenges, model interpretation issues, and model selection issues. A first implementation of machine learning-based predictions termed REACHacross achieved unprecedented sensitivities of >80% with specificities >70% in predicting the six most common acute and topical hazards covering about two thirds of the chemical universe. While this is awaiting formal validation, it demonstrates the new quality introduced by big data and modern data-mining technologies. The rapid increase in the diversity and number of computational models, as well as the data they are based on, create challenges and opportunities for the use of computational methods.

Original language: English (US)
Pages (from-to): 732-744
Number of pages: 13
Journal: Toxicology Research
Volume: 7
Issue number: 5
DOI: 10.1039/c8tx00051d
State: Published - Jan 1 2018

Fingerprint

Risk assessment
Toxicology
Learning systems
Databases
Benchmarking
Data Mining
Quantitative Structure-Activity Relationship
Health hazards
Computational methods
Animal Models
Data mining
Technology
Hazards
Skin
Animals
Machine Learning
Big data
Health

ASJC Scopus subject areas

  • Toxicology
  • Health, Toxicology and Mutagenesis

Cite this

Big-data and machine learning to revamp computational toxicology and its use in risk assessment. / Luechtefeld, Thomas; Rowlands, Craig; Hartung, Thomas.

In: Toxicology Research, Vol. 7, No. 5, 01.01.2018, p. 732-744.

@article{dcaea000103c40449a9427b146ad6428,
title = "Big-data and machine learning to revamp computational toxicology and its use in risk assessment",
abstract = "The creation of large toxicological databases and advances in machine-learning techniques have empowered computational approaches in toxicology. Work with these large databases based on regulatory data has allowed reproducibility assessment of animal models, which highlight weaknesses in traditional in vivo methods. This should lower the bars for the introduction of new approaches and represents a benchmark that is achievable for any alternative method validated against these methods. Quantitative Structure Activity Relationships (QSAR) models for skin sensitization, eye irritation, and other human health hazards based on these big databases, however, also have made apparent some of the challenges facing computational modeling, including validation challenges, model interpretation issues, and model selection issues. A first implementation of machine learning-based predictions termed REACHacross achieved unprecedented sensitivities of >80{\%} with specificities >70{\%} in predicting the six most common acute and topical hazards covering about two thirds of the chemical universe. While this is awaiting formal validation, it demonstrates the new quality introduced by big data and modern data-mining technologies. The rapid increase in the diversity and number of computational models, as well as the data they are based on, create challenges and opportunities for the use of computational methods.",
author = "Thomas Luechtefeld and Craig Rowlands and Thomas Hartung",
year = "2018",
month = "1",
day = "1",
doi = "10.1039/c8tx00051d",
language = "English (US)",
volume = "7",
pages = "732--744",
journal = "Toxicology Research",
issn = "2045-452X",
publisher = "Royal Society of Chemistry",
number = "5",
}

TY - JOUR

T1 - Big-data and machine learning to revamp computational toxicology and its use in risk assessment

AU - Luechtefeld, Thomas

AU - Rowlands, Craig

AU - Hartung, Thomas

PY - 2018/1/1

Y1 - 2018/1/1

N2 - The creation of large toxicological databases and advances in machine-learning techniques have empowered computational approaches in toxicology. Work with these large databases based on regulatory data has allowed reproducibility assessment of animal models, which highlight weaknesses in traditional in vivo methods. This should lower the bars for the introduction of new approaches and represents a benchmark that is achievable for any alternative method validated against these methods. Quantitative Structure Activity Relationships (QSAR) models for skin sensitization, eye irritation, and other human health hazards based on these big databases, however, also have made apparent some of the challenges facing computational modeling, including validation challenges, model interpretation issues, and model selection issues. A first implementation of machine learning-based predictions termed REACHacross achieved unprecedented sensitivities of >80% with specificities >70% in predicting the six most common acute and topical hazards covering about two thirds of the chemical universe. While this is awaiting formal validation, it demonstrates the new quality introduced by big data and modern data-mining technologies. The rapid increase in the diversity and number of computational models, as well as the data they are based on, create challenges and opportunities for the use of computational methods.

AB - The creation of large toxicological databases and advances in machine-learning techniques have empowered computational approaches in toxicology. Work with these large databases based on regulatory data has allowed reproducibility assessment of animal models, which highlight weaknesses in traditional in vivo methods. This should lower the bars for the introduction of new approaches and represents a benchmark that is achievable for any alternative method validated against these methods. Quantitative Structure Activity Relationships (QSAR) models for skin sensitization, eye irritation, and other human health hazards based on these big databases, however, also have made apparent some of the challenges facing computational modeling, including validation challenges, model interpretation issues, and model selection issues. A first implementation of machine learning-based predictions termed REACHacross achieved unprecedented sensitivities of >80% with specificities >70% in predicting the six most common acute and topical hazards covering about two thirds of the chemical universe. While this is awaiting formal validation, it demonstrates the new quality introduced by big data and modern data-mining technologies. The rapid increase in the diversity and number of computational models, as well as the data they are based on, create challenges and opportunities for the use of computational methods.

UR - http://www.scopus.com/inward/record.url?scp=85052761187&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85052761187&partnerID=8YFLogxK

U2 - 10.1039/c8tx00051d

DO - 10.1039/c8tx00051d

M3 - Article

C2 - 30310652

AN - SCOPUS:85052761187

VL - 7

SP - 732

EP - 744

JO - Toxicology Research

JF - Toxicology Research

SN - 2045-452X

IS - 5

ER -