Temporal convolutional networks: A unified approach to action segmentation

Colin Lea, René Vidal, Austin Reiter, Gregory Hager

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

The dominant paradigm for video-based action segmentation is composed of two steps: first, compute low-level features for each frame using Dense Trajectories or a Convolutional Neural Network to encode local spatiotemporal information, and second, input these features into a classifier such as a Recurrent Neural Network (RNN) that captures high-level temporal relationships. While often effective, this decoupling requires specifying two separate models, each with their own complexities, and prevents capturing more nuanced long-range spatiotemporal relationships. We propose a unified approach, as demonstrated by our Temporal Convolutional Network (TCN), that hierarchically captures relationships at low-, intermediate-, and high-level time-scales. Our model achieves superior or competitive performance using video or sensor data on three public action segmentation datasets and can be trained in a fraction of the time it takes to train an RNN.
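As an illustration of the hierarchical idea in the abstract, the following is a minimal sketch of stacking temporal convolutions with pooling over per-frame features. The abstract does not specify layer types or sizes, so everything here is an assumption: the convolution-then-max-pool layout, the single output channel, and the function names `temporal_conv`, `max_pool_time`, and `receptive_field` are all hypothetical. The sketch only shows how each additional layer covers a longer span of frames.

```python
# Hypothetical sketch, not the authors' implementation.
# Assumes each layer is a 1D convolution over time followed by max pooling.

def temporal_conv(frames, kernel, bias=0.0):
    """Convolve over time with zero padding, then apply ReLU.

    frames: list of T per-frame feature vectors (lists of F floats).
    kernel: list of d weight vectors (each of length F), d odd.
    Returns T single-channel outputs.
    """
    half = len(kernel) // 2
    out = []
    for t in range(len(frames)):
        acc = bias
        for k, weights in enumerate(kernel):
            s = t + k - half
            if 0 <= s < len(frames):  # frames outside the clip count as zero
                acc += sum(w * x for w, x in zip(weights, frames[s]))
        out.append([max(0.0, acc)])
    return out


def max_pool_time(frames, width=2):
    """Max over non-overlapping windows of `width` frames, per channel."""
    return [
        [max(channel) for channel in zip(*frames[i:i + width])]
        for i in range(0, len(frames) - width + 1, width)
    ]


def receptive_field(num_layers, kernel_size, pool=2):
    """Input frames seen by one output after `num_layers` conv+pool layers."""
    rf, stride = 1, 1
    for _ in range(num_layers):
        rf += (kernel_size - 1) * stride  # convolution widens the window
        rf += (pool - 1) * stride         # pooling widens it further
        stride *= pool                    # and coarsens the time step
    return rf


if __name__ == "__main__":
    frames = [[1.0], [2.0], [3.0], [4.0]]
    kernel = [[1.0], [1.0], [1.0]]
    hidden = max_pool_time(temporal_conv(frames, kernel))
    print(hidden)
    print([receptive_field(n, kernel_size=3) for n in (1, 2, 3)])
```

Under these assumptions, deeper layers summarize progressively longer windows of frames, which is one way to read the low-, intermediate-, and high-level time-scales mentioned above.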

Original language: English (US)
Title of host publication: Computer Vision – ECCV 2016 Workshops, Proceedings
Publisher: Springer Verlag
Pages: 47-54
Number of pages: 8
Volume: 9915 LNCS
ISBN (Print): 9783319494081
DOIs: https://doi.org/10.1007/978-3-319-49409-8_7
State: Published - 2016
Event: 14th European Conference on Computer Vision, ECCV 2016 - Amsterdam, Netherlands
Duration: Oct 8 2016 to Oct 16 2016

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 9915 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 14th European Conference on Computer Vision, ECCV 2016
Country: Netherlands
City: Amsterdam
Period: 10/8/16 to 10/16/16

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Lea, C., Vidal, R., Reiter, A., & Hager, G. (2016). Temporal convolutional networks: A unified approach to action segmentation. In Computer Vision – ECCV 2016 Workshops, Proceedings (Vol. 9915 LNCS, pp. 47-54). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 9915 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-319-49409-8_7