A data-mining framework for large scale analysis of dose-outcome relationships in a database of irradiated head and neck cancer patients

Scott P. Robertson, Harry Quon, Ana Ponce Kiess, Joseph A. Moore, Wuyang Yang, Zhi Cheng, Sarah Afonso, Mysha Allen, Marian Richardson, Amanda Choflet, Andrew Sharabi, Todd McNutt

Research output: Contribution to journal › Article

Abstract

Purpose: To develop a hypothesis-generating framework for automatic extraction of dose-outcome relationships from an in-house, analytic oncology database.

Methods: Dose-volume histograms (DVH) and clinical outcomes were routinely stored in the authors' database for 684 head and neck cancer patients treated from 2007 to 2014. Database queries were developed to extract outcomes that had been assessed for at least 100 patients, as well as DVH curves for organs-at-risk (OAR) that were contoured for at least 100 patients. DVH curves for paired OAR (e.g., left and right parotids) were automatically combined and included as additional structures for analysis. For each OAR-outcome combination, only patients with both OAR and outcome records were analyzed. DVH dose points, D(Vt), at a given normalized volume threshold Vt were stratified into two groups based on the severity of toxicity outcomes after treatment completion. The probability of an outcome was modeled at each Vt = 0%, 1%, ..., 100% by logistic regression. Notable OAR-outcome combinations were defined as those having statistically significant regression parameters (p < 0.05) and an odds ratio of at least 1.05 (a 5% increase in odds per Gy).

Results: A total of 57 individual and combined structures and 97 outcomes were queried from the database. Of all possible OAR-outcome combinations, 17% resulted in significant logistic regression fits (p < 0.05) with an odds ratio of at least 1.05. Further manual inspection revealed a number of reasonable models based on either reported literature or proximity between neighboring OARs. The data-mining algorithm confirmed the following well-known outcome/OAR-dose relationships: dysphagia/larynx, voice changes/larynx, esophagitis/esophagus, xerostomia/parotid glands, and mucositis/oral mucosa. Several surrogate relationships, defined as those involving an OAR not directly attributed to an outcome, were also observed, including esophagitis/larynx, mucositis/mandible, and xerostomia/mandible.

Conclusions: Prospective collection of clinical data has enabled large-scale analysis of dose-outcome relationships. The current data-mining framework revealed both known and novel dosimetric and clinical relationships, underscoring the potential utility of this analytic approach in hypothesis generation. Multivariate models and advanced, 3D dosimetric features may be necessary to further evaluate the complex relationship between neighboring OAR and observed outcomes.
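
The screening rule in the Methods (a logistic fit at each Vt, kept when p < 0.05 and the odds ratio per Gy is at least 1.05) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the Newton-Raphson fit, the use of a Wald test for the p-value, and the synthetic cohort are all assumptions made for illustration, since the abstract does not specify them.

```python
import math

def fit_logistic(doses, events, max_iter=50, tol=1e-10):
    """Fit P(event) = sigmoid(b0 + b1 * dose) by Newton-Raphson.

    Returns (odds ratio per Gy, two-sided Wald p-value) for b1.
    The Wald test is an assumption; the abstract does not name the test.
    """
    mean_dose = sum(doses) / len(doses)
    xs = [d - mean_dose for d in doses]  # centering leaves the slope unchanged
    b0 = b1 = 0.0
    for _ in range(max_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(xs, events):
            z = max(-30.0, min(30.0, b0 + b1 * x))  # guard against overflow
            p = 1.0 / (1.0 + math.exp(-z))
            w = p * (1.0 - p)
            g0 += y - p           # score (gradient) terms
            g1 += (y - p) * x
            h00 += w              # information matrix terms
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        if det < 1e-12:           # no usable dose variation: slope not identifiable
            return 1.0, 1.0
        step0 = (h11 * g0 - h01 * g1) / det
        step1 = (h00 * g1 - h01 * g0) / det
        b0 += step0
        b1 += step1
        if abs(step0) + abs(step1) < tol:
            break
    std_err = math.sqrt(h00 / det)  # slope variance from the inverse information
    wald_z = b1 / std_err
    return math.exp(b1), math.erfc(abs(wald_z) / math.sqrt(2.0))

def notable_thresholds(dvh, events, min_or=1.05, alpha=0.05):
    """dvh[i][v] is D(Vt) in Gy for patient i at Vt = v percent."""
    hits = []
    for v in range(len(dvh[0])):
        odds_ratio, p = fit_logistic([row[v] for row in dvh], events)
        if p < alpha and odds_ratio >= min_or:
            hits.append((v, odds_ratio, p))
    return hits

# Hypothetical synthetic cohort: 20 patients at each of six dose levels,
# with the number of toxicity events rising with dose.
levels = [10, 20, 30, 40, 50, 60]
n_events = [2, 4, 8, 12, 16, 18]
max_doses, events = [], []
for d, k in zip(levels, n_events):
    max_doses += [float(d)] * 20
    events += [1] * k + [0] * (20 - k)

# Hypothetical DVH shape: dose falls linearly to half its maximum at Vt = 100%.
dvh = [[d * (1 - v / 200) for v in range(101)] for d in max_doses]

odds_ratio, p_value = fit_logistic(max_doses, events)
hits = notable_thresholds(dvh, events)
print(f"OR per Gy at Vt=0%: {odds_ratio:.3f}, p = {p_value:.2g}")
print(f"notable thresholds: {len(hits)} of {len(dvh[0])}")
```

In a screen over every OAR-outcome pair, the same function would simply be called once per pair. The sketch applies no correction for multiple comparisons, which fits the hypothesis-generating intent stated in the abstract but means flagged combinations still need manual review.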

Original language: English (US)
Pages (from-to): 4329-4337
Number of pages: 9
Journal: Medical Physics
Volume: 42
Issue number: 7
DOI: https://doi.org/10.1118/1.4922686
State: Published - Jul 1 2015

Fingerprint

Organs at Risk
Data Mining
Head and Neck Neoplasms
Databases
Larynx
Xerostomia
Mucositis
Esophagitis
Mandible
Logistic Models
Odds Ratio
Parotid Gland
Mouth Mucosa
Deglutition Disorders
Esophagus

Keywords

  • dose-outcome modeling
  • head and neck cancer
  • large-scale analytics
  • toxicity

ASJC Scopus subject areas

  • Biophysics
  • Radiology, Nuclear Medicine and Imaging

Cite this

A data-mining framework for large scale analysis of dose-outcome relationships in a database of irradiated head and neck cancer patients. / Robertson, Scott P.; Quon, Harry; Kiess, Ana Ponce; Moore, Joseph A.; Yang, Wuyang; Cheng, Zhi; Afonso, Sarah; Allen, Mysha; Richardson, Marian; Choflet, Amanda; Sharabi, Andrew; McNutt, Todd.

In: Medical Physics, Vol. 42, No. 7, 01.07.2015, p. 4329-4337.

Robertson, SP, Quon, H, Kiess, AP, Moore, JA, Yang, W, Cheng, Z, Afonso, S, Allen, M, Richardson, M, Choflet, A, Sharabi, A & McNutt, T 2015, 'A data-mining framework for large scale analysis of dose-outcome relationships in a database of irradiated head and neck cancer patients', Medical Physics, vol. 42, no. 7, pp. 4329-4337. https://doi.org/10.1118/1.4922686
@article{52a5cd3ca05b4f18a884facaabaad306,
title = "A data-mining framework for large scale analysis of dose-outcome relationships in a database of irradiated head and neck cancer patients",
keywords = "dose-outcome modeling, head and neck cancer, large-scale analytics, toxicity",
author = "Robertson, {Scott P.} and Harry Quon and Kiess, {Ana Ponce} and Moore, {Joseph A.} and Wuyang Yang and Zhi Cheng and Sarah Afonso and Mysha Allen and Marian Richardson and Amanda Choflet and Andrew Sharabi and Todd McNutt",
year = "2015",
month = "7",
day = "1",
doi = "10.1118/1.4922686",
language = "English (US)",
volume = "42",
pages = "4329--4337",
journal = "Medical Physics",
issn = "0094-2405",
publisher = "AAPM - American Association of Physicists in Medicine",
number = "7",

}

TY - JOUR

T1 - A data-mining framework for large scale analysis of dose-outcome relationships in a database of irradiated head and neck cancer patients

AU - Robertson, Scott P.

AU - Quon, Harry

AU - Kiess, Ana Ponce

AU - Moore, Joseph A.

AU - Yang, Wuyang

AU - Cheng, Zhi

AU - Afonso, Sarah

AU - Allen, Mysha

AU - Richardson, Marian

AU - Choflet, Amanda

AU - Sharabi, Andrew

AU - McNutt, Todd

PY - 2015/7/1

Y1 - 2015/7/1

KW - dose-outcome modeling

KW - head and neck cancer

KW - large-scale analytics

KW - toxicity

UR - http://www.scopus.com/inward/record.url?scp=84933059854&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84933059854&partnerID=8YFLogxK

U2 - 10.1118/1.4922686

DO - 10.1118/1.4922686

M3 - Article

VL - 42

SP - 4329

EP - 4337

JO - Medical Physics

JF - Medical Physics

SN - 0094-2405

IS - 7

ER -