A Unified Framework for Multi-view Multi-class Object Pose Estimation

Chi Li, Jin Bai, Gregory Hager

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

One core challenge in object pose estimation is to ensure accurate and robust performance for large numbers of diverse foreground objects amidst complex background clutter. In this work, we present a scalable framework for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that this framework consistently improves the performance of the single-view network. We evaluate our method on three large-scale benchmarks: YCB-Video, JHUScene-50 and ObjectNet-3D. Our approach achieves competitive or superior performance over the current state-of-the-art methods.
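
The abstract's idea of combining classification with regression over a uniform tessellation can be illustrated with a deliberately simplified toy. The sketch below is not the authors' network or their SE(3) tessellation: it assumes a single rotation angle split into equal-width bins, and the names `encode_angle` and `decode_angle` are hypothetical. A classifier would pick the bin, and a regressor would supply the offset from that bin's center.

```python
import math


def encode_angle(theta, num_bins):
    """Map an angle to (bin index, residual from the bin center).

    Assumption: a 1-D uniform binning of [0, 2*pi) stands in for the
    uniform tessellation of SE(3) described in the abstract.
    """
    width = 2.0 * math.pi / num_bins
    theta = theta % (2.0 * math.pi)
    # Clamp guards against a float result that lands exactly on 2*pi.
    index = min(int(theta // width), num_bins - 1)
    center = (index + 0.5) * width
    return index, theta - center


def decode_angle(bin_scores, residuals):
    """Pick the highest-scoring bin and add that bin's regressed residual."""
    num_bins = len(bin_scores)
    width = 2.0 * math.pi / num_bins
    index = max(range(num_bins), key=lambda i: bin_scores[i])
    center = (index + 0.5) * width
    return (center + residuals[index]) % (2.0 * math.pi)


# Round trip with idealized "network outputs" (hypothetical values).
num_bins = 8
index, residual = encode_angle(1.0, num_bins)
scores = [0.0] * num_bins
scores[index] = 1.0          # classifier is certain about the bin
offsets = [0.0] * num_bins
offsets[index] = residual    # regressor predicts the in-bin offset
recovered = decode_angle(scores, offsets)
```

The point of the split is that the classifier only has to resolve a coarse cell, while the regressor only has to model a small, bounded offset inside it (at most half a bin width here).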

Original language: English (US)
Title of host publication: Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings
Editors: Yair Weiss, Vittorio Ferrari, Cristian Sminchisescu, Martial Hebert
Publisher: Springer Verlag
Pages: 263-281
Number of pages: 19
ISBN (Print): 9783030012694
DOIs: https://doi.org/10.1007/978-3-030-01270-0_16
State: Published - Jan 1 2018
Event: 15th European Conference on Computer Vision, ECCV 2018 - Munich, Germany
Duration: Sep 8 2018 → Sep 14 2018

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 11220 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Other

Other: 15th European Conference on Computer Vision, ECCV 2018
Country: Germany
City: Munich
Period: 9/8/18 → 9/14/18

Keywords

  • Deep learning
  • Multi-view recognition
  • Object pose estimation

ASJC Scopus subject areas

  • Theoretical Computer Science
  • Computer Science (all)

Cite this

Li, C., Bai, J., & Hager, G. (2018). A Unified Framework for Multi-view Multi-class Object Pose Estimation. In Y. Weiss, V. Ferrari, C. Sminchisescu, & M. Hebert (Eds.), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings (pp. 263-281). (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11220 LNCS). Springer Verlag. https://doi.org/10.1007/978-3-030-01270-0_16

A Unified Framework for Multi-view Multi-class Object Pose Estimation. / Li, Chi; Bai, Jin; Hager, Gregory.

Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. ed. / Yair Weiss; Vittorio Ferrari; Cristian Sminchisescu; Martial Hebert. Springer Verlag, 2018. p. 263-281 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics); Vol. 11220 LNCS).

Li, C, Bai, J & Hager, G 2018, A Unified Framework for Multi-view Multi-class Object Pose Estimation. in Y Weiss, V Ferrari, C Sminchisescu & M Hebert (eds), Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), vol. 11220 LNCS, Springer Verlag, pp. 263-281, 15th European Conference on Computer Vision, ECCV 2018, Munich, Germany, 9/8/18. https://doi.org/10.1007/978-3-030-01270-0_16
Li C, Bai J, Hager G. A Unified Framework for Multi-view Multi-class Object Pose Estimation. In Weiss Y, Ferrari V, Sminchisescu C, Hebert M, editors, Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. Springer Verlag. 2018. p. 263-281. (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)). https://doi.org/10.1007/978-3-030-01270-0_16
Li, Chi ; Bai, Jin ; Hager, Gregory. / A Unified Framework for Multi-view Multi-class Object Pose Estimation. Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings. editor / Yair Weiss ; Vittorio Ferrari ; Cristian Sminchisescu ; Martial Hebert. Springer Verlag, 2018. pp. 263-281 (Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)).
@inproceedings{5f1750e76ecf4b3885dfaa61e75cca36,
title = "A Unified Framework for Multi-view Multi-class Object Pose Estimation",
abstract = "One core challenge in object pose estimation is to ensure accurate and robust performance for large numbers of diverse foreground objects amidst complex background clutter. In this work, we present a scalable framework for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that this framework consistently improves the performance of the single-view network. We evaluate our method on three large-scale benchmarks: YCB-Video, JHUScene-50 and ObjectNet-3D. Our approach achieves competitive or superior performance over the current state-of-the-art methods.",
keywords = "Deep learning, Multi-view recognition, Object pose estimation",
author = "Chi Li and Jin Bai and Gregory Hager",
year = "2018",
month = "1",
day = "1",
doi = "10.1007/978-3-030-01270-0_16",
language = "English (US)",
isbn = "9783030012694",
series = "Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)",
publisher = "Springer Verlag",
pages = "263--281",
editor = "Yair Weiss and Vittorio Ferrari and Cristian Sminchisescu and Martial Hebert",
booktitle = "Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings",

}

TY - GEN

T1 - A Unified Framework for Multi-view Multi-class Object Pose Estimation

AU - Li, Chi

AU - Bai, Jin

AU - Hager, Gregory

PY - 2018/1/1

Y1 - 2018/1/1

N2 - One core challenge in object pose estimation is to ensure accurate and robust performance for large numbers of diverse foreground objects amidst complex background clutter. In this work, we present a scalable framework for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that this framework consistently improves the performance of the single-view network. We evaluate our method on three large-scale benchmarks: YCB-Video, JHUScene-50 and ObjectNet-3D. Our approach achieves competitive or superior performance over the current state-of-the-art methods.

AB - One core challenge in object pose estimation is to ensure accurate and robust performance for large numbers of diverse foreground objects amidst complex background clutter. In this work, we present a scalable framework for accurately inferring six Degree-of-Freedom (6-DoF) pose for a large number of object classes from single or multiple views. To learn discriminative pose features, we integrate three new capabilities into a deep Convolutional Neural Network (CNN): an inference scheme that combines both classification and pose regression based on a uniform tessellation of the Special Euclidean group in three dimensions (SE(3)), the fusion of class priors into the training process via a tiled class map, and an additional regularization using deep supervision with an object mask. Further, an efficient multi-view framework is formulated to address single-view ambiguity. We show that this framework consistently improves the performance of the single-view network. We evaluate our method on three large-scale benchmarks: YCB-Video, JHUScene-50 and ObjectNet-3D. Our approach achieves competitive or superior performance over the current state-of-the-art methods.

KW - Deep learning

KW - Multi-view recognition

KW - Object pose estimation

UR - http://www.scopus.com/inward/record.url?scp=85055093674&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85055093674&partnerID=8YFLogxK

U2 - 10.1007/978-3-030-01270-0_16

DO - 10.1007/978-3-030-01270-0_16

M3 - Conference contribution

AN - SCOPUS:85055093674

SN - 9783030012694

T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)

SP - 263

EP - 281

BT - Computer Vision – ECCV 2018 - 15th European Conference, 2018, Proceedings

A2 - Weiss, Yair

A2 - Ferrari, Vittorio

A2 - Sminchisescu, Cristian

A2 - Hebert, Martial

PB - Springer Verlag

ER -