TY - JOUR
T1 - Plus Disease in Retinopathy of Prematurity
T2 - A Continuous Spectrum of Vascular Abnormality as a Basis of Diagnostic Variability
AU - Campbell, J. Peter
AU - Kalpathy-Cramer, Jayashree
AU - Erdogmus, Deniz
AU - Tian, Peng
AU - Kedarisetti, Dharanish
AU - Moleta, Chace
AU - Reynolds, James D.
AU - Hutcheson, Kelly
AU - Shapiro, Michael J.
AU - Repka, Michael X.
AU - Ferrone, Philip
AU - Drenser, Kimberly
AU - Horowitz, Jason
AU - Sonmez, Kemal
AU - Swan, Ryan
AU - Ostmo, Susan
AU - Jonas, Karyn E.
AU - Chan, R. V. Paul
AU - Chiang, Michael F.
AU - Jonas, Karyn
AU - Coki, Osode
AU - Eccles, Cheryl Ann
AU - Sarna, Leora
AU - Berrocal, Audina
AU - Negron, Catherin
AU - Drenser, Kimberly
AU - Cumming, Kristi
AU - Osentoski, Tammy
AU - Check, Tammy
AU - Zajechowski, Mary
AU - Lee, Thomas
AU - Kruger, Evan
AU - McGovern, Kathryn
AU - Simmons, Charles
AU - Murthy, Raghu
AU - Galvis, Sharon
AU - Rotter, Jerome
AU - Chen, Ida
AU - Li, Xiaohui
AU - Taylor, Kent
AU - Roll, Kaye
AU - Martinez-Castellanos, Maria Ana
AU - Salinas-Longoria, Samantha
AU - Romero, Rafael
AU - Arriola, Andrea
AU - Olguin-Manriquez, Francisco
AU - Meraz-Gutierrez, Miroslava
AU - Dulanto-Reinoso, Carlos M.
AU - Montero-Mendoza, Cristina
N1 - Funding Information:
Supported by the National Institutes of Health, Bethesda, Maryland (grant nos.: R01 EY19474 [J.K.C., D.E., S.O., K.E.J., R.V.P.C., M.F.C.], P30 EY010572 [J.P.C., S.O., M.F.C.], R21 EY022387 [J.K.C., D.E., M.F.C.], and T32 EY23211 [M.F.C., R.S.]); the National Center for Advancing Translational Sciences at the National Institutes of Health, Bethesda, Maryland (Oregon Clinical and Translational Research Institute grant no.: TL1TR000129); Research to Prevent Blindness, New York, New York (J.P.C., S.N.P., J.D.R., M.X.R., S.O., K.E.J., R.V.P.C., M.F.C.); and the iNsight Foundation, New York, New York (R.V.P.C., K.E.J.). The content is solely the responsibility of the authors and does not necessarily represent the official views of the National Institutes of Health. No funding organizations had any role in the design or conduct of this research. Dr. Michael F. Chiang had full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Publisher Copyright:
© 2016 American Academy of Ophthalmology
PY - 2016/11/1
Y1 - 2016/11/1
AB - Purpose To identify patterns of interexpert discrepancy in plus disease diagnosis in retinopathy of prematurity (ROP). Design We developed 2 datasets of clinical images as part of the Imaging and Informatics in ROP study and determined a consensus reference standard diagnosis (RSD) for each image based on 3 independent image graders and the clinical examination results. We recruited 8 expert ROP clinicians to classify these images and compared the distribution of classifications between experts and the RSD. Participants Eight participating experts with more than 10 years of clinical ROP experience and more than 5 peer-reviewed ROP publications who analyzed images obtained during routine ROP screening in neonatal intensive care units. Methods Expert classification of images of plus disease in ROP. Main Outcome Measures Interexpert agreement (weighted κ statistic) and agreement and bias on ordinal classification between experts (analysis of variance [ANOVA]) and the RSD (percent agreement). Results There was variable interexpert agreement on diagnostic classifications between the 8 experts and the RSD (weighted κ, 0–0.75; mean, 0.30). The RSD agreement ranged from 80% to 94% for the dataset of 100 images and from 29% to 79% for the dataset of 34 images. However, when images were ranked in order of disease severity (by average expert classification), the pattern of expert classification revealed a consistent systematic bias for each expert, consistent with unique cut points for the diagnosis of plus disease and preplus disease. The 2-way ANOVA model suggested a highly significant effect of both image and user on the average score (dataset A: P < 0.05 and adjusted R2 = 0.82; dataset B: P < 0.05 and adjusted R2 = 0.6615). Conclusions There is wide variability in the classification of plus disease by ROP experts, which occurs because experts have different cut points for the amounts of vascular abnormality required for the presence of plus and preplus disease. This has important implications for research, teaching, and patient care for ROP and suggests that a continuous ROP plus disease severity score may more accurately reflect the behavior of expert ROP clinicians and may better standardize classification in the future.
UR - http://www.scopus.com/inward/record.url?scp=84994891814&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=84994891814&partnerID=8YFLogxK
U2 - 10.1016/j.ophtha.2016.07.026
DO - 10.1016/j.ophtha.2016.07.026
M3 - Article
C2 - 27591053
AN - SCOPUS:84994891814
SN - 0161-6420
VL - 123
SP - 2338
EP - 2344
JO - Ophthalmology
JF - Ophthalmology
IS - 11
ER -