Assessment of Automated Identification of Phases in Videos of Cataract Surgery Using Machine Learning and Deep Learning Techniques

Felix Yu, Gianluca Silva Croso, Tae Soo Kim, Ziang Song, Felix Parker, Gregory Hager, Austin Reiter, S. Swaroop Vedula, Haider Ali, Shameema Sikder

Research output: Contribution to journal › Article

Abstract

Importance: Competence in cataract surgery is a public health necessity, and videos of cataract surgery are routinely available to educators and trainees but are currently of limited use in training. Machine learning and deep learning techniques can yield tools that efficiently segment videos of cataract surgery into constituent phases for subsequent automated skill assessment and feedback.

Objective: To evaluate machine learning and deep learning algorithms for automated phase classification of manually presegmented phases in videos of cataract surgery.

Design, Setting, and Participants: This cross-sectional study used a data set of videos from a convenience sample of 100 cataract procedures performed by faculty and trainee surgeons in an ophthalmology residency program from July 2011 to December 2017. Demographic characteristics of surgeons and patients were not captured. Ten standard phase labels and 14 instruments used during surgery were manually annotated; these annotations served as the ground truth.

Exposures: Five algorithms, each with different input data: (1) a support vector machine (SVM) using cross-sectional instrument label data; (2) a recurrent neural network (RNN) using a time series of instrument labels; (3) a convolutional neural network (CNN) using cross-sectional image data; (4) a CNN-RNN using a time series of images; and (5) a CNN-RNN using a time series of images and instrument labels. Each algorithm was evaluated with 5-fold cross-validation.

Main Outcomes and Measures: Accuracy, area under the receiver operating characteristic curve (AUC), sensitivity, specificity, and precision.

Results: Unweighted accuracy for the 5 algorithms ranged from 0.915 to 0.959. AUC for the 5 algorithms ranged from 0.712 to 0.773, with small differences among them. The AUC for the image-only CNN-RNN (0.752) was significantly greater than that of the CNN with cross-sectional image data (0.712) (difference, -0.040; 95% CI, -0.049 to -0.033) and that of the CNN-RNN with images and instrument labels (0.737) (difference, 0.016; 95% CI, 0.014 to 0.018). While specificity was uniformly high for all phases with all 5 algorithms (range, 0.877 to 0.999), sensitivity ranged from 0.005 (95% CI, 0.000 to 0.015) for the SVM on wound closure (corneal hydration) to 0.974 (95% CI, 0.957 to 0.991) for the RNN on the main incision. Precision ranged from 0.283 to 0.963.

Conclusions and Relevance: Time series modeling of instrument labels and video images using deep learning techniques may yield useful tools for automated detection of phases in cataract surgery procedures.
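The record itself gives no implementation detail, so the following is only a minimal PyTorch sketch of exposure (4), a CNN-RNN over a time series of images: a small convolutional encoder turns each frame into a feature vector, a GRU models the frame sequence, and a linear head scores the 10 annotated phases. The backbone depth, feature and hidden sizes, clip length, and frame resolution are illustrative assumptions, not the authors' architecture.

```python
import torch
import torch.nn as nn


class CNNRNNPhaseClassifier(nn.Module):
    """Sketch of a CNN-RNN phase classifier: a CNN encodes each frame,
    an RNN models the sequence of frame features, and a linear head
    predicts the phase of the presegmented clip."""

    def __init__(self, num_phases: int = 10, feat_dim: int = 128, hidden: int = 64):
        super().__init__()
        # Small frame encoder; the paper's actual backbone is not specified here.
        self.cnn = nn.Sequential(
            nn.Conv2d(3, 16, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
            nn.Linear(32, feat_dim),
        )
        self.rnn = nn.GRU(feat_dim, hidden, batch_first=True)
        self.head = nn.Linear(hidden, num_phases)

    def forward(self, frames: torch.Tensor) -> torch.Tensor:
        # frames: (batch, time, channels, height, width)
        b, t, c, h, w = frames.shape
        feats = self.cnn(frames.reshape(b * t, c, h, w)).reshape(b, t, -1)
        out, _ = self.rnn(feats)      # per-frame temporal features
        return self.head(out[:, -1])  # one phase logit vector per clip


# Example: classify a batch of 2 presegmented clips of 16 RGB frames each.
model = CNNRNNPhaseClassifier()
logits = model(torch.randn(2, 16, 3, 64, 64))
print(logits.shape)  # torch.Size([2, 10]) -- one score per surgical phase
```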

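A companion sketch of the evaluation protocol, 5-fold cross-validation scored by accuracy and AUC as listed under Main Outcomes and Measures, applied here to exposure (1), an SVM over per-segment instrument labels. The synthetic 14-dimensional binary instrument vectors, the 10 random phase labels, and the default SVC kernel are placeholders, not the study's data or model.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score, roc_auc_score

# Synthetic stand-in: 500 presegmented phases, each described by a
# 14-dimensional binary vector of instrument presence, labeled with one
# of 10 phases. The real study used manually annotated surgical videos.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(500, 14)).astype(float)
y = rng.integers(0, 10, size=500)

accs, aucs = [], []
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
for train_idx, test_idx in cv.split(X, y):
    clf = SVC(probability=True).fit(X[train_idx], y[train_idx])
    accs.append(accuracy_score(y[test_idx], clf.predict(X[test_idx])))
    # Macro one-vs-rest AUC across the 10 phases -- one plausible way to
    # summarize per-phase ROC analysis in a multiclass setting.
    aucs.append(roc_auc_score(y[test_idx], clf.predict_proba(X[test_idx]),
                              multi_class="ovr", average="macro"))

print(f"mean accuracy {np.mean(accs):.3f}, mean AUC {np.mean(aucs):.3f}")
```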
Original language: English (US)
Pages (from-to): e191860
Journal: JAMA Network Open
Volume: 2
Issue number: 4
DOIs: 10.1001/jamanetworkopen.2019.1860
State: Published - Apr 5 2019

Fingerprint

Cataract
Learning
ROC Curve
Ophthalmology
Internship and Residency
Mental Competency
Machine Learning
Public Health
Cross-Sectional Studies
Demography
Outcome Assessment (Health Care)
Sensitivity and Specificity
Wounds and Injuries
Surgeons
Support Vector Machine

Cite this

Assessment of Automated Identification of Phases in Videos of Cataract Surgery Using Machine Learning and Deep Learning Techniques. / Yu, Felix; Silva Croso, Gianluca; Kim, Tae Soo; Song, Ziang; Parker, Felix; Hager, Gregory; Reiter, Austin; Vedula, S. Swaroop; Ali, Haider; Sikder, Shameema.

In: JAMA Network Open, Vol. 2, No. 4, 05.04.2019, p. e191860.

Research output: Contribution to journal › Article

Yu, Felix ; Silva Croso, Gianluca ; Kim, Tae Soo ; Song, Ziang ; Parker, Felix ; Hager, Gregory ; Reiter, Austin ; Vedula, S. Swaroop ; Ali, Haider ; Sikder, Shameema. / Assessment of Automated Identification of Phases in Videos of Cataract Surgery Using Machine Learning and Deep Learning Techniques. In: JAMA Network Open. 2019 ; Vol. 2, No. 4. p. e191860.
@article{c77e3662347d4391bb67d311ea88f106,
title = "Assessment of Automated Identification of Phases in Videos of Cataract Surgery Using Machine Learning and Deep Learning Techniques",
author = "Felix Yu and {Silva Croso}, Gianluca and Kim, {Tae Soo} and Ziang Song and Felix Parker and Gregory Hager and Austin Reiter and Vedula, {S. Swaroop} and Haider Ali and Shameema Sikder",
year = "2019",
month = "4",
day = "5",
doi = "10.1001/jamanetworkopen.2019.1860",
language = "English (US)",
volume = "2",
pages = "e191860",
journal = "JAMA Network Open",
issn = "2574-3805",
publisher = "American Medical Association",
number = "4",
}

TY - JOUR

T1 - Assessment of Automated Identification of Phases in Videos of Cataract Surgery Using Machine Learning and Deep Learning Techniques

AU - Yu, Felix

AU - Silva Croso, Gianluca

AU - Kim, Tae Soo

AU - Song, Ziang

AU - Parker, Felix

AU - Hager, Gregory

AU - Reiter, Austin

AU - Vedula, S. Swaroop

AU - Ali, Haider

AU - Sikder, Shameema

PY - 2019/4/5

Y1 - 2019/4/5

UR - http://www.scopus.com/inward/record.url?scp=85064322772&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85064322772&partnerID=8YFLogxK

U2 - 10.1001/jamanetworkopen.2019.1860

DO - 10.1001/jamanetworkopen.2019.1860

M3 - Article

C2 - 30951163

AN - SCOPUS:85064322772

VL - 2

SP - e191860

JO - JAMA Network Open

JF - JAMA Network Open

SN - 2574-3805

IS - 4

ER -