TY - JOUR
T1 - Evaluating discrimination of a lung cancer risk prediction model using partial risk-score in a two-phase study
AU - Choudhury, Parichoy Pal
AU - Chaturvedi, Anil K.
AU - Chatterjee, Nilanjan
N1 - Funding Information:
The first author would like to thank Dr. Mustapha Abubakar (DCEG, NCI) for his helpful comments in improving the presentation of the manuscript. The work of P. Pal Choudhury and N. Chatterjee was supported by the Patient-Centered Outcomes Research Institute (PCORI) Award (ME-1602-34530). The work of A.K. Chaturvedi was supported by the Intramural Research Program, Division of Cancer Epidemiology and Genetics, NCI, NIH, Department of Health and Human Services. The statements and opinions in this article are solely the responsibility of the authors and do not necessarily represent the views of the PCORI, its Board of Governors, or Methodology Committee or the NCI, NIH, or the Department of Health and Human Services.
Publisher Copyright:
© 2020 American Association for Cancer Research.
PY - 2020/6
Y1 - 2020/6
N2 - Background: Independent validation of risk prediction models in prospective cohorts is required for risk-stratified cancer prevention. Such studies often have a two-phase design, where information on expensive biomarkers is ascertained in a nested substudy of the original cohort. Methods: We propose a simple approach for evaluating model discrimination that accounts for incomplete follow-up and gains efficiency by using data from all individuals in the cohort, irrespective of whether they were sampled in the substudy. For evaluating the AUC, we estimated the probabilities of risk-scores for cases being larger than those for controls, conditional on partial risk-scores computed using partial covariate information. The proposed method was compared with an inverse probability weighted (IPW) approach that used information only from the subjects in the substudy. We evaluated the age-stratified AUC of a model including questionnaire-based risk factors and inflammation biomarkers to predict 10-year risk of lung cancer using data from the Prostate, Lung, Colorectal, and Ovarian Cancer (1993–2009) trial (30,297 ever-smokers, 1,253 patients with lung cancer). Results: For estimating the age-stratified AUC of the combined lung cancer risk model, the proposed method was 3.8 to 5.3 times more efficient than the IPW approach across the different age groups. Extensive simulation studies also demonstrated substantial efficiency gains compared with the IPW approach. Conclusions: Incorporating information from all individuals in a two-phase cohort study can substantially improve the precision of discrimination measures of lung cancer risk models. Impact: Novel, simple, and practically useful methods are proposed for evaluating risk models, a critical step toward risk-stratified cancer prevention.
UR - http://www.scopus.com/inward/record.url?scp=85085713453&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85085713453&partnerID=8YFLogxK
U2 - 10.1158/1055-9965.EPI-19-1574
DO - 10.1158/1055-9965.EPI-19-1574
M3 - Article
C2 - 32277002
AN - SCOPUS:85085713453
SN - 1055-9965
VL - 29
SP - 1196
EP - 1203
JO - Cancer Epidemiology Biomarkers and Prevention
JF - Cancer Epidemiology Biomarkers and Prevention
IS - 6
ER -