Multiple model evaluation absent the gold standard through model combination

Edwin S. Iversen; Giovanni Parmigiani; Sining Chen

doi:10.1198/016214507000001012

School of Medicine

Research output: Contribution to journal › Article › peer-review

3 Scopus citations

Abstract

We describe a method for evaluating an ensemble of predictive models given a sample of observations comprising the model predictions and the outcome event measured with error. Our formulation allows us to simultaneously estimate measurement error parameters, true outcome - the "gold standard" - and a relative weighting of the predictive scores. We describe conditions necessary to estimate the gold standard and to calibrate these estimates and detail how our approach is related to, but distinct from, standard model combination techniques. We apply our approach to data from a study to evaluate a collection of BRCA1/BRCA2 gene mutation prediction scores. In this example, genotype is measured with error by one or more genetic assays. We estimate true genotype for each individual in the data set, operating characteristics of the commonly used genotyping procedures, and a relative weighting of the scores. Finally, we compare the scores against the gold standard genotype and find that Mendelian scores are, on average, the more refined and better calibrated of those considered and that the comparison is sensitive to measurement error in the gold standard.
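As a minimal illustration of why measurement error in the assay matters, the sketch below updates a single prediction score with one imperfect test result using Bayes' rule. It is not the authors' model: the abstract describes jointly estimating the error parameters, the true genotype, and the score weights, whereas here the sensitivity and specificity are assumed known and the score is treated as a prior carrier probability. The function name `posterior_carrier_prob` and all numeric values are hypothetical.

```python
def posterior_carrier_prob(score, test_positive, sensitivity, specificity):
    """Posterior probability of true carrier status.

    Assumptions (not from the paper): `score` is used directly as the prior
    probability of carrying a mutation, and the assay's sensitivity and
    specificity are known constants rather than estimated quantities.
    """
    for name, value in (("score", score),
                        ("sensitivity", sensitivity),
                        ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must lie in [0, 1], got {value}")

    if test_positive:
        like_carrier = sensitivity            # P(test + | carrier)
        like_noncarrier = 1.0 - specificity   # P(test + | non-carrier)
    else:
        like_carrier = 1.0 - sensitivity      # P(test - | carrier)
        like_noncarrier = specificity         # P(test - | non-carrier)

    numerator = score * like_carrier
    denominator = numerator + (1.0 - score) * like_noncarrier
    if denominator == 0.0:
        raise ValueError("observed result has zero probability under the inputs")
    return numerator / denominator


# Hypothetical values: prior score 0.2, assay sensitivity 0.8, specificity 0.99.
p_after_positive = posterior_carrier_prob(0.2, True, 0.8, 0.99)
p_after_negative = posterior_carrier_prob(0.2, False, 0.8, 0.99)
```

With these hypothetical inputs, a negative result still leaves a carrier probability of about 0.05 because the assumed assay misses some mutations, which is the kind of uncertainty in the "gold standard" that the abstract refers to.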

Original language	English (US)
Pages (from-to)	897-909
Number of pages	13
Journal	Journal of the American Statistical Association
Volume	103
Issue number	483
DOIs	https://doi.org/10.1198/016214507000001012
State	Published - Sep 2008

Keywords

Bayesian analysis
Breast cancer susceptibility genes
Measurement error
Model combination
Model evaluation

ASJC Scopus subject areas

Statistics and Probability
Statistics, Probability and Uncertainty

Cite this

@article{9627ff33162a44a28985b3a5c4635175,
title = "Multiple model evaluation absent the gold standard through model combination",
abstract = "We describe a method for evaluating an ensemble of predictive models given a sample of observations comprising the model predictions and the outcome event measured with error. Our formulation allows us to simultaneously estimate measurement error parameters, true outcome - the {"}gold standard{"} - and a relative weighting of the predictive scores. We describe conditions necessary to estimate the gold standard and to calibrate these estimates and detail how our approach is related to, but distinct from, standard model combination techniques. We apply our approach to data from a study to evaluate a collection of BRCA1/BRCA2 gene mutation prediction scores. In this example, genotype is measured with error by one or more genetic assays. We estimate true genotype for each individual in the data set, operating characteristics of the commonly used genotyping procedures, and a relative weighting of the scores. Finally, we compare the scores against the gold standard genotype and find that Mendelian scores are, on average, the more refined and better calibrated of those considered and that the comparison is sensitive to measurement error in the gold standard.",
keywords = "Bayesian analysis, Breast cancer susceptibility genes, Measurement error, Model combination, Model evaluation",
author = "Iversen, {Edwin S.} and Giovanni Parmigiani and Sining Chen",
year = "2008",
month = sep,
doi = "10.1198/016214507000001012",
language = "English (US)",
volume = "103",
pages = "897--909",
journal = "Journal of the American Statistical Association",
issn = "0162-1459",
publisher = "Taylor and Francis Ltd.",
number = "483",
}

TY - JOUR
T1 - Multiple model evaluation absent the gold standard through model combination
AU - Iversen, Edwin S.
AU - Parmigiani, Giovanni
AU - Chen, Sining
PY - 2008/9
Y1 - 2008/9
N2 - We describe a method for evaluating an ensemble of predictive models given a sample of observations comprising the model predictions and the outcome event measured with error. Our formulation allows us to simultaneously estimate measurement error parameters, true outcome - the "gold standard" - and a relative weighting of the predictive scores. We describe conditions necessary to estimate the gold standard and to calibrate these estimates and detail how our approach is related to, but distinct from, standard model combination techniques. We apply our approach to data from a study to evaluate a collection of BRCA1/BRCA2 gene mutation prediction scores. In this example, genotype is measured with error by one or more genetic assays. We estimate true genotype for each individual in the data set, operating characteristics of the commonly used genotyping procedures, and a relative weighting of the scores. Finally, we compare the scores against the gold standard genotype and find that Mendelian scores are, on average, the more refined and better calibrated of those considered and that the comparison is sensitive to measurement error in the gold standard.
KW - Bayesian analysis
KW - Breast cancer susceptibility genes
KW - Measurement error
KW - Model combination
KW - Model evaluation
UR - http://www.scopus.com/inward/record.url?scp=54949093248&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=54949093248&partnerID=8YFLogxK
U2 - 10.1198/016214507000001012
DO - 10.1198/016214507000001012
M3 - Article
AN - SCOPUS:54949093248
SN - 0162-1459
VL - 103
SP - 897
EP - 909
JO - Journal of the American Statistical Association
JF - Journal of the American Statistical Association
IS - 483
ER -
