TY - JOUR
T1 - A Dataset and Benchmarks for Segmentation and Recognition of Gestures in Robotic Surgery
AU - Ahmidi, Narges
AU - Tao, Lingling
AU - Sefati, Shahin
AU - Gao, Yixin
AU - Lea, Colin
AU - Haro, Benjamin Bejar
AU - Zappella, Luca
AU - Khudanpur, Sanjeev
AU - Vidal, Rene
AU - Hager, Gregory D.
N1 - Funding Information:
Manuscript received September 14, 2016; revised November 16, 2016; accepted December 23, 2016. Date of publication January 4, 2017; date of current version August 18, 2017. This work was supported in part by the NIH under Grant 1R01-DE025265, in part by the NSF under Grant 0534359, Grant IIS-0748338, Grant OIA 0941362, and Grant CSN 0931805, in part by the Sloan Foundation, and in part by the NSF Graduate Research Fellowship Program. The work of B. B. Haro was supported in part by the Talentia Fellowship Program of the Andalusian Regional Ministry of Economy, Innovation and Science. The work of C. Lea was supported in part by the NSF under Grant DGE-1232825. The work of R. Vidal was supported in part by European Research Council Grant VideoWorld.
Publisher Copyright:
© 1964-2012 IEEE.
PY - 2017/9
Y1 - 2017/9
N2 - Objective: State-of-the-art techniques for surgical data analysis report promising results for automated skill assessment and action recognition. The contributions of many of these techniques, however, are limited to study-specific data and validation metrics, making assessment of progress across the field extremely challenging. Methods: In this paper, we address two major problems for surgical data analysis: first, the lack of uniform, shared datasets and benchmarks, and second, the lack of consistent validation processes. We address the former by presenting the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a public dataset that we have created to support comparative research benchmarking. JIGSAWS contains synchronized video and kinematic data from multiple performances of robotic surgical tasks by operators of varying skill. We address the latter by presenting a well-documented evaluation methodology and reporting results for six techniques for automated segmentation and classification of time-series data on JIGSAWS. These techniques comprise four temporal approaches for joint segmentation and classification: hidden Markov model (HMM), sparse HMM, Markov semi-Markov conditional random field, and skip-chain conditional random field; and two feature-based ones that aim to classify fixed segments: bag of spatiotemporal features and linear dynamical systems. Results: Most methods recognize gesture activities with approximately 80% overall accuracy under both leave-one-super-trial-out and leave-one-user-out cross-validation settings. Conclusion: Current methods show promising results on this shared dataset, but room for significant progress remains, particularly for consistent prediction of gesture activities across different surgeons. Significance: The results reported in this paper provide the first systematic and uniform evaluation of surgical activity recognition techniques on the benchmark database.
AB - Objective: State-of-the-art techniques for surgical data analysis report promising results for automated skill assessment and action recognition. The contributions of many of these techniques, however, are limited to study-specific data and validation metrics, making assessment of progress across the field extremely challenging. Methods: In this paper, we address two major problems for surgical data analysis: first, the lack of uniform, shared datasets and benchmarks, and second, the lack of consistent validation processes. We address the former by presenting the JHU-ISI Gesture and Skill Assessment Working Set (JIGSAWS), a public dataset that we have created to support comparative research benchmarking. JIGSAWS contains synchronized video and kinematic data from multiple performances of robotic surgical tasks by operators of varying skill. We address the latter by presenting a well-documented evaluation methodology and reporting results for six techniques for automated segmentation and classification of time-series data on JIGSAWS. These techniques comprise four temporal approaches for joint segmentation and classification: hidden Markov model (HMM), sparse HMM, Markov semi-Markov conditional random field, and skip-chain conditional random field; and two feature-based ones that aim to classify fixed segments: bag of spatiotemporal features and linear dynamical systems. Results: Most methods recognize gesture activities with approximately 80% overall accuracy under both leave-one-super-trial-out and leave-one-user-out cross-validation settings. Conclusion: Current methods show promising results on this shared dataset, but room for significant progress remains, particularly for consistent prediction of gesture activities across different surgeons. Significance: The results reported in this paper provide the first systematic and uniform evaluation of surgical activity recognition techniques on the benchmark database.
KW - Activity recognition
KW - benchmark robotic dataset
KW - kinematics and video
KW - surgical motion
UR - http://www.scopus.com/inward/record.url?scp=85021087012&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85021087012&partnerID=8YFLogxK
U2 - 10.1109/TBME.2016.2647680
DO - 10.1109/TBME.2016.2647680
M3 - Article
C2 - 28060703
AN - SCOPUS:85021087012
SN - 0018-9294
VL - 64
SP - 2025
EP - 2041
JO - IEEE Transactions on Biomedical Engineering
JF - IEEE Transactions on Biomedical Engineering
IS - 9
M1 - 7805258
ER -