A three tiered approach for articulated object action modeling and recognition

Le Lu; Gregory D. Hager; Laurent Younes

Whiting School of Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Scopus citation

Abstract

Visual action recognition is an important problem in computer vision. In this paper, we propose a new method to probabilistically model and recognize actions of articulated objects, such as hand or body gestures, in image sequences. Our method consists of three levels of representation. At the low level, we first extract a feature vector invariant to scale and in-plane rotation by using the Fourier transform of a circular spatial histogram. Then, spectral partitioning [20] is utilized to obtain an initial clustering; this clustering is then refined using a temporal smoothness constraint. Gaussian mixture model (GMM) based clustering and density estimation in the subspace of linear discriminant analysis (LDA) are then applied to thousands of image feature vectors to obtain an intermediate level representation. Finally, at the high level we build a temporal multiresolution histogram model for each action by aggregating the clustering weights of sampled images belonging to that action. We discuss how this high level representation can be extended to achieve temporal scaling invariance and to include Bi-gram or Multi-gram transition information. Both image clustering and action recognition/segmentation results are given to show the validity of our three tiered representation.
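
The low-level feature described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the bin counts, centering on the point centroid, and normalizing by the largest radius are all assumptions made for illustration. The idea shown is that an in-plane rotation cyclically shifts the angular bins of a circular histogram, and the magnitude of the discrete Fourier transform along that axis does not change under a cyclic shift.

```python
import numpy as np

def circular_histogram(points, n_angle=16, n_radius=4):
    """Histogram 2-D points over (radius, angle) bins around their centroid.

    Hypothetical helper: bin counts and normalization are illustrative choices.
    """
    centered = points - points.mean(axis=0)
    radius = np.hypot(centered[:, 0], centered[:, 1])
    angle = np.arctan2(centered[:, 1], centered[:, 0])  # in [-pi, pi]
    # Dividing by the largest radius makes the radial binning scale invariant.
    radius = radius / (radius.max() + 1e-12)
    r_bin = np.minimum((radius * n_radius).astype(int), n_radius - 1)
    a_bin = ((angle + np.pi) / (2 * np.pi) * n_angle).astype(int) % n_angle
    hist = np.zeros((n_radius, n_angle))
    np.add.at(hist, (r_bin, a_bin), 1.0)  # accumulate counts per bin
    return hist / hist.sum()

def rotation_invariant_feature(hist):
    """DFT magnitude along the angular axis, flattened to a feature vector."""
    # A rotation cyclically shifts the angular axis; |DFT| ignores such shifts.
    return np.abs(np.fft.fft(hist, axis=1)).ravel()

rng = np.random.default_rng(0)
pts = rng.normal(size=(200, 2))          # stand-in for foreground pixel coordinates
hist = circular_histogram(pts)
feature = rotation_invariant_feature(hist)
```

Shifting `hist` along its angular axis with `np.roll(hist, k, axis=1)` leaves `feature` unchanged, which is the invariance the abstract refers to. Rotations that are not a multiple of the angular bin width are only approximately invariant under this binning.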

Original language	English (US)
Title of host publication	Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004
Publisher	Neural information processing systems foundation
ISBN (Print)	0262195348, 9780262195348
State	Published - 2005
Event	18th Annual Conference on Neural Information Processing Systems, NIPS 2004 - Vancouver, BC, Canada
Duration	Dec 13 2004 → Dec 16 2004

Publication series

Name	Advances in Neural Information Processing Systems
ISSN (Print)	1049-5258

Other

Other	18th Annual Conference on Neural Information Processing Systems, NIPS 2004
Country/Territory	Canada
City	Vancouver, BC
Period	12/13/04 → 12/16/04

ASJC Scopus subject areas

Computer Networks and Communications
Information Systems
Signal Processing

Cite this

A three tiered approach for articulated object action modeling and recognition. / Lu, Le; Hager, Gregory D.; Younes, Laurent.
Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004. Neural information processing systems foundation, 2005. (Advances in Neural Information Processing Systems).

Lu, L, Hager, GD & Younes, L 2005, A three tiered approach for articulated object action modeling and recognition. in Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004. Advances in Neural Information Processing Systems, Neural information processing systems foundation, 18th Annual Conference on Neural Information Processing Systems, NIPS 2004, Vancouver, BC, Canada, 12/13/04.

@inproceedings{d739d5e5d30646519428ff66ff7a15b4,

title = "A three tiered approach for articulated object action modeling and recognition",

abstract = "Visual action recognition is an important problem in computer vision. In this paper, we propose a new method to probabilistically model and recognize actions of articulated objects, such as hand or body gestures, in image sequences. Our method consists of three levels of representation. At the low level, we first extract a feature vector invariant to scale and in-plane rotation by using the Fourier transform of a circular spatial histogram. Then, spectral partitioning [20] is utilized to obtain an initial clustering; this clustering is then refined using a temporal smoothness constraint. Gaussian mixture model (GMM) based clustering and density estimation in the subspace of linear discriminant analysis (LDA) are then applied to thousands of image feature vectors to obtain an intermediate level representation. Finally, at the high level we build a temporal multiresolution histogram model for each action by aggregating the clustering weights of sampled images belonging to that action. We discuss how this high level representation can be extended to achieve temporal scaling invariance and to include Bi-gram or Multi-gram transition information. Both image clustering and action recognition/segmentation results are given to show the validity of our three tiered representation.",

author = "Le Lu and Hager, {Gregory D.} and Laurent Younes",

year = "2005",

language = "English (US)",

isbn = "0262195348",

series = "Advances in Neural Information Processing Systems",

publisher = "Neural information processing systems foundation",

booktitle = "Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004",

note = "18th Annual Conference on Neural Information Processing Systems, NIPS 2004 ; Conference date: 13-12-2004 Through 16-12-2004",

}

TY - GEN

T1 - A three tiered approach for articulated object action modeling and recognition

AU - Lu, Le

AU - Hager, Gregory D.

AU - Younes, Laurent

PY - 2005

Y1 - 2005

N2 - Visual action recognition is an important problem in computer vision. In this paper, we propose a new method to probabilistically model and recognize actions of articulated objects, such as hand or body gestures, in image sequences. Our method consists of three levels of representation. At the low level, we first extract a feature vector invariant to scale and in-plane rotation by using the Fourier transform of a circular spatial histogram. Then, spectral partitioning [20] is utilized to obtain an initial clustering; this clustering is then refined using a temporal smoothness constraint. Gaussian mixture model (GMM) based clustering and density estimation in the subspace of linear discriminant analysis (LDA) are then applied to thousands of image feature vectors to obtain an intermediate level representation. Finally, at the high level we build a temporal multiresolution histogram model for each action by aggregating the clustering weights of sampled images belonging to that action. We discuss how this high level representation can be extended to achieve temporal scaling invariance and to include Bi-gram or Multi-gram transition information. Both image clustering and action recognition/segmentation results are given to show the validity of our three tiered representation.

AB - Visual action recognition is an important problem in computer vision. In this paper, we propose a new method to probabilistically model and recognize actions of articulated objects, such as hand or body gestures, in image sequences. Our method consists of three levels of representation. At the low level, we first extract a feature vector invariant to scale and in-plane rotation by using the Fourier transform of a circular spatial histogram. Then, spectral partitioning [20] is utilized to obtain an initial clustering; this clustering is then refined using a temporal smoothness constraint. Gaussian mixture model (GMM) based clustering and density estimation in the subspace of linear discriminant analysis (LDA) are then applied to thousands of image feature vectors to obtain an intermediate level representation. Finally, at the high level we build a temporal multiresolution histogram model for each action by aggregating the clustering weights of sampled images belonging to that action. We discuss how this high level representation can be extended to achieve temporal scaling invariance and to include Bi-gram or Multi-gram transition information. Both image clustering and action recognition/segmentation results are given to show the validity of our three tiered representation.

UR - http://www.scopus.com/inward/record.url?scp=70350389487&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=70350389487&partnerID=8YFLogxK

M3 - Conference contribution

AN - SCOPUS:70350389487

SN - 0262195348

SN - 9780262195348

T3 - Advances in Neural Information Processing Systems

BT - Advances in Neural Information Processing Systems 17 - Proceedings of the 2004 Conference, NIPS 2004

PB - Neural information processing systems foundation

T2 - 18th Annual Conference on Neural Information Processing Systems, NIPS 2004

Y2 - 13 December 2004 through 16 December 2004

ER -
