An improved model for segmentation and recognition of fine-grained activities with application to surgical training tasks

Colin Lea, Gregory Hager, René Vidal

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Automated segmentation and recognition of fine-grained activities is important for enabling new applications in industrial automation, human-robot collaboration, and surgical training. Many existing approaches to activity recognition assume that a video has already been segmented and perform classification using an abstract representation based on spatio-temporal features. While some approaches perform joint activity segmentation and recognition, they typically suffer from a poor modeling of the transitions between actions and a representation that does not incorporate contextual information about the scene. In this paper, we propose a model for action segmentation and recognition that improves upon existing work in two directions. First, we develop a variation of the Skip-Chain Conditional Random Field that captures long-range state transitions between actions by using higher-order temporal relationships. Second, we argue that in constrained environments, where the relevant set of objects is known, it is better to develop features using high-level object relationships that have semantic meaning instead of relying on abstract features. We apply our approach to a set of tasks common for training in robotic surgery: suturing, knot tying, and needle passing, and show that our method increases micro and macro accuracy by 18.46% and 44.13% relative to the state of the art on a widely used robotic surgery dataset.
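The skip-chain idea from the abstract can be illustrated with a minimal scoring sketch: beyond the usual adjacent-frame transition term of a linear-chain CRF, a skip edge links each frame t to frame t-d, letting the model score longer-range action-to-action transitions. All names, the toy numbers, and this particular scoring form are illustrative, not the paper's actual model.

```python
# Minimal sketch of scoring a label sequence under a skip-chain CRF.
# Illustrative only: the paper's variant uses higher-order temporal
# relationships; here we show the basic skip-edge scoring structure.

def score_sequence(labels, unary, pairwise, skip, d):
    """Unnormalized log-score of a label sequence.

    labels   : list of int label ids, one per frame
    unary    : unary[t][y] = per-frame score of label y at frame t
    pairwise : pairwise[y1][y2] = adjacent-frame transition score
    skip     : skip[y1][y2] = long-range (distance-d) transition score
    d        : skip distance in frames
    """
    s = sum(unary[t][y] for t, y in enumerate(labels))
    s += sum(pairwise[labels[t - 1]][labels[t]] for t in range(1, len(labels)))
    s += sum(skip[labels[t - d]][labels[t]] for t in range(d, len(labels)))
    return s

# Toy example: 2 action labels, 4 frames, skip distance 2.
unary = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.0, 1.0]]
pairwise = [[0.2, -0.1], [-0.1, 0.2]]
skip = [[0.1, 0.0], [0.0, 0.1]]
print(score_sequence([0, 0, 1, 1], unary, pairwise, skip, d=2))
```

Inference in the actual model would maximize this score over all label sequences; the sketch only evaluates one candidate sequence.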

Original language: English (US)
Title of host publication: Proceedings - 2015 IEEE Winter Conference on Applications of Computer Vision, WACV 2015
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 1123-1129
Number of pages: 7
ISBN (Print): 9781479966820
DOI: 10.1109/WACV.2015.154
State: Published - Feb 19 2015
Event: 2015 15th IEEE Winter Conference on Applications of Computer Vision, WACV 2015 - Waikoloa, United States
Duration: Jan 5 2015 - Jan 9 2015



ASJC Scopus subject areas

  • Computer Science Applications
  • Computer Vision and Pattern Recognition

Cite this

Lea, C., Hager, G., & Vidal, R. (2015). An improved model for segmentation and recognition of fine-grained activities with application to surgical training tasks. In Proceedings - 2015 IEEE Winter Conference on Applications of Computer Vision, WACV 2015 (pp. 1123-1129). [7046008] Institute of Electrical and Electronics Engineers Inc.. https://doi.org/10.1109/WACV.2015.154
