TY - JOUR
T1 - A Dataset and Benchmarks for Segmentation and Recognition of Gestures in Robotic Surgery
AU - Ahmidi, Narges
AU - Tao, Lingling
AU - Sefati, Shahin
AU - Gao, Yixin
AU - Lea, Colin
AU - Haro, Benjamin Bejar
AU - Zappella, Luca
AU - Khudanpur, Sanjeev
AU - Vidal, Rene
AU - Hager, Gregory D.
N1 - Funding Information:
Manuscript received September 14, 2016; revised November 16, 2016; accepted December 23, 2016. Date of publication January 4, 2017; date of current version August 18, 2017. This work was supported in part by the NIH under Grant 1R01-DE025265, in part by the NSF under Grant 0534359, Grant IIS-0748338, Grant OIA 0941362, and Grant CSN 0931805, in part by the Sloan Foundation, and in part by the NSF Graduate Research Fellowship Program. The work of B. B. Haro was supported in part by the Talentia Fellowship Program of the Andalusian Regional Ministry of Economy, Innovation and Science. The work of C. Lea was supported in part by the NSF under Grant DGE-1232825. The work of R. Vidal was supported in part by European Research Council Grant VideoWorld.
Publisher Copyright:
© 1964-2012 IEEE.
PY - 2017/9
Y1 - 2017/9
N2 - Objective: State-of-the-art techniques for surgical data analysis report promising results for automated skill assessment and action recognition. The contributions of many of these techniques, however, are limited to study-specific data and validation metrics, making assessment of progress across the field extremely challenging. Methods: In this paper, we address two major problems for surgical data analysis: first, the lack of uniform, shared datasets and benchmarks, and second, the lack of consistent validation processes. We address the former by presenting the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a public dataset that we have created to support comparative research benchmarking. JIGSAWS contains synchronized video and kinematic data from multiple performances of robotic surgical tasks by operators of varying skill. We address the latter by presenting a well-documented evaluation methodology and reporting results for six techniques for automated segmentation and classification of time-series data on JIGSAWS. These techniques comprise four temporal approaches for joint segmentation and classification: hidden Markov model (HMM), sparse HMM, Markov semi-Markov conditional random field, and skip-chain conditional random field; and two feature-based ones that aim to classify fixed segments: bag of spatiotemporal features and linear dynamical systems. Results: Most methods recognize gesture activities with approximately 80% overall accuracy under both leave-one-super-trial-out and leave-one-user-out cross-validation settings. Conclusion: Current methods show promising results on this shared dataset, but room for significant progress remains, particularly for consistent prediction of gesture activities across different surgeons. Significance: The results reported in this paper provide the first systematic and uniform evaluation of surgical activity recognition techniques on the benchmark database.
AB - Objective: State-of-the-art techniques for surgical data analysis report promising results for automated skill assessment and action recognition. The contributions of many of these techniques, however, are limited to study-specific data and validation metrics, making assessment of progress across the field extremely challenging. Methods: In this paper, we address two major problems for surgical data analysis: first, the lack of uniform, shared datasets and benchmarks, and second, the lack of consistent validation processes. We address the former by presenting the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a public dataset that we have created to support comparative research benchmarking. JIGSAWS contains synchronized video and kinematic data from multiple performances of robotic surgical tasks by operators of varying skill. We address the latter by presenting a well-documented evaluation methodology and reporting results for six techniques for automated segmentation and classification of time-series data on JIGSAWS. These techniques comprise four temporal approaches for joint segmentation and classification: hidden Markov model (HMM), sparse HMM, Markov semi-Markov conditional random field, and skip-chain conditional random field; and two feature-based ones that aim to classify fixed segments: bag of spatiotemporal features and linear dynamical systems. Results: Most methods recognize gesture activities with approximately 80% overall accuracy under both leave-one-super-trial-out and leave-one-user-out cross-validation settings. Conclusion: Current methods show promising results on this shared dataset, but room for significant progress remains, particularly for consistent prediction of gesture activities across different surgeons. Significance: The results reported in this paper provide the first systematic and uniform evaluation of surgical activity recognition techniques on the benchmark database.
KW - Activity recognition
KW - benchmark robotic dataset
KW - kinematics and video
KW - surgical motion
UR - http://www.scopus.com/inward/record.url?scp=85021087012&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021087012&partnerID=8YFLogxK
U2 - 10.1109/TBME.2016.2647680
DO - 10.1109/TBME.2016.2647680
M3 - Article
C2 - 28060703
AN - SCOPUS:85021087012
SN - 0018-9294
VL - 64
SP - 2025
EP - 2041
JO - IEEE Transactions on Biomedical Engineering
JF - IEEE Transactions on Biomedical Engineering
IS - 9
M1 - 7805258
ER -