A Unified Framework for Multi-view Multi-class Object Pose Estimation

Chi Li, Jin Bai, Gregory D. Hager

Research output: Chapter in Book/Report/Conference proceedingConference contribution

Abstract

One[NOSPACE] [NOSPACE][SPACE]core challenge in object pose estimation is to ensure accurate and robust performance for large numbers of diverse foreground objects amidst complex background clutter. In this work, we present a scalable framework for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that this framework consistently improves the performance of the single-view network. We evaluate our method on three large-scale benchmarks: YCB-Video, JHUScene-50 and ObjectNet-3D. Our approach achieves competitive or superior performance over the current state-of-the-art methods.

Original languageEnglish (US)
Title of host publicationComputer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
EditorsYair Weiss, Vittorio Ferrari, Cristian Sminchisescu, Martial Hebert
PublisherSpringer Verlag
Pages263-281
Number of pages19
ISBN (Print)9783030012694
DOIs
StatePublished - 2018
Event15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: Sep 8 2018Sep 14 2018

Publication series

NameLecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume11220 LNCS
ISSN (Print)0302-9743
ISSN (Electronic)1611-3349

Other

Other15th European Conference on Computer Vision, ECCV 2018
CountryGermany
CityMunich
Period9/8/189/14/18

Keywords

  • Deep learning
  • Multi-view recognition
  • Object pose estimation

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science(all)

Fingerprint Dive into the research topics of 'A Unified Framework for Multi-view Multi-class Object Pose Estimation'. Together they form a unique fingerprint.

  • Cite this

    Li, C., Bai, J., & Hager, G. D. (2018). A Unified Framework for Multi-view Multi-class Object Pose Estimation. In Y. Weiss, V. Ferrari, C. Sminchisescu, & M. Hebert (Eds.), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 263-281). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11220 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01270-0_16