Incremental scene understanding on dense SLAM

Chi Li; Han Xiao; Keisuke Tateno; Federico Tombari; Nassir Navab; Gregory D. Hager

doi:10.1109/IROS.2016.7759111

Whiting School of Engineering

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

10 Scopus citations

Abstract

We present an architecture for online, incremental scene modeling which combines a SLAM-based scene understanding framework with semantic segmentation and object pose estimation. The core of this approach comprises a probabilistic inference scheme that predicts semantic labels for object hypotheses at each new frame. From these hypotheses, recognized scene structures are incrementally constructed and tracked. Semantic labels are inferred using a multi-domain convolutional architecture which operates on the image time series and which enables efficient propagation of features as well as robust model registration. To evaluate this architecture, we introduce a large-scale RGB-D dataset, JHUSEQ-25, as a new benchmark for sequence-based scene understanding in complex and densely cluttered scenes. This dataset contains 25 RGB-D video sequences with 100,000 labeled frames in total. We validate our method on this dataset and demonstrate improved semantic segmentation and 6-DoF object pose estimation performance compared with single-view methods.
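The abstract does not spell out the inference scheme, so the following is only a minimal sketch of one common way to accumulate label evidence for a single object hypothesis across frames: a recursive Bayesian update under a naive per-frame independence assumption. It is not the authors' method. The function name `fuse_label_beliefs`, the three labels, and the per-frame scores are all hypothetical and chosen purely for illustration.

```python
def fuse_label_beliefs(prior, frame_scores, eps=1e-12):
    """Multiply the current belief by one frame's label scores and renormalize.

    Assumption (not from the paper): frames contribute independent evidence,
    so the posterior is proportional to prior * per-frame score.
    """
    # eps keeps a label from being eliminated forever by a single zero score.
    unnormalized = [p * (s + eps) for p, s in zip(prior, frame_scores)]
    total = sum(unnormalized)
    return [u / total for u in unnormalized]


# Hypothetical label set and classifier scores for one object hypothesis.
labels = ["background", "drill", "hammer"]
belief = [1.0 / len(labels)] * len(labels)  # uniform prior before any frame

per_frame_scores = [
    [0.5, 0.3, 0.2],  # first frame: ambiguous view
    [0.2, 0.6, 0.2],  # second frame: more evidence for "drill"
    [0.1, 0.7, 0.2],  # third frame: stronger evidence for "drill"
]

# Incrementally update the belief as each new frame arrives.
for scores in per_frame_scores:
    belief = fuse_label_beliefs(belief, scores)

best = labels[max(range(len(belief)), key=belief.__getitem__)]
```

In this toy run the first frame alone would favor "background", while the accumulated belief over the three frames favors "drill", which illustrates why sequence-based labeling can outperform a single view.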

Original language	English (US)
Title of host publication	IROS 2016 - 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems
Publisher	Institute of Electrical and Electronics Engineers Inc.
Pages	574-581
Number of pages	8
ISBN (Electronic)	9781509037629
DOIs	https://doi.org/10.1109/IROS.2016.7759111
State	Published - Nov 28 2016
Event	2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016 - Daejeon, Korea, Republic of. Duration: Oct 9 2016 → Oct 14 2016

Publication series

Name	IEEE International Conference on Intelligent Robots and Systems
Volume	2016-November
ISSN (Print)	2153-0858
ISSN (Electronic)	2153-0866

Other

Other	2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016
Country/Territory	Korea, Republic of
City	Daejeon
Period	10/9/16 → 10/14/16

ASJC Scopus subject areas

Control and Systems Engineering
Software
Computer Vision and Pattern Recognition
Computer Science Applications

Access to Document

10.1109/IROS.2016.7759111

Cite this

Li, C., Xiao, H., Tateno, K., Tombari, F., Navab, N., & Hager, G. D. (2016). Incremental scene understanding on dense SLAM. In IROS 2016 - 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems (pp. 574-581). Article 7759111 (IEEE International Conference on Intelligent Robots and Systems; Vol. 2016-November). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/IROS.2016.7759111

Incremental scene understanding on dense SLAM. / Li, Chi; Xiao, Han; Tateno, Keisuke et al.
IROS 2016 - 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems. Institute of Electrical and Electronics Engineers Inc., 2016. p. 574-581 7759111 (IEEE International Conference on Intelligent Robots and Systems; Vol. 2016-November).

Li, C, Xiao, H, Tateno, K, Tombari, F, Navab, N & Hager, GD 2016, Incremental scene understanding on dense SLAM. in IROS 2016 - 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems., 7759111, IEEE International Conference on Intelligent Robots and Systems, vol. 2016-November, Institute of Electrical and Electronics Engineers Inc., pp. 574-581, 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016, Daejeon, Korea, Republic of, 10/9/16. https://doi.org/10.1109/IROS.2016.7759111

@inproceedings{6d21ebd20e6549f79f28ce0c1f5dd49c,

title = "Incremental scene understanding on dense SLAM",

abstract = "We present an architecture for online, incremental scene modeling which combines a SLAM-based scene understanding framework with semantic segmentation and object pose estimation. The core of this approach comprises a probabilistic inference scheme that predicts semantic labels for object hypotheses at each new frame. From these hypotheses, recognized scene structures are incrementally constructed and tracked. Semantic labels are inferred using a multi-domain convolutional architecture which operates on the image time series and which enables efficient propagation of features as well as robust model registration. To evaluate this architecture, we introduce a large-scale RGB-D dataset JHUSEQ-25 as a new benchmark for the sequence-based scene understanding in complex and densely cluttered scenes. This dataset contains 25 RGB-D video sequences with 100,000 labeled frames in total. We validate our method on this dataset and demonstrate improved performance of semantic segmentation and 6-DoF object pose estimation compared with methods based on the single view.",

author = "Chi Li and Han Xiao and Keisuke Tateno and Federico Tombari and Nassir Navab and Hager, {Gregory D.}",

note = "Funding Information: This work is supported by the National Science Foundation under Grant No. NRI-1227277 Publisher Copyright: {\textcopyright} 2016 IEEE. Copyright: Copyright 2017 Elsevier B.V., All rights reserved.; 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016 ; Conference date: 09-10-2016 Through 14-10-2016",

year = "2016",

month = nov,

day = "28",

doi = "10.1109/IROS.2016.7759111",

language = "English (US)",

series = "IEEE International Conference on Intelligent Robots and Systems",

publisher = "Institute of Electrical and Electronics Engineers Inc.",

pages = "574--581",

booktitle = "IROS 2016 - 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems",

}

TY - GEN

T1 - Incremental scene understanding on dense SLAM

AU - Li, Chi

AU - Xiao, Han

AU - Tateno, Keisuke

AU - Tombari, Federico

AU - Navab, Nassir

AU - Hager, Gregory D.

N1 - Funding Information: This work is supported by the National Science Foundation under Grant No. NRI-1227277 Publisher Copyright: © 2016 IEEE. Copyright: Copyright 2017 Elsevier B.V., All rights reserved.

PY - 2016/11/28

Y1 - 2016/11/28

AB - We present an architecture for online, incremental scene modeling which combines a SLAM-based scene understanding framework with semantic segmentation and object pose estimation. The core of this approach comprises a probabilistic inference scheme that predicts semantic labels for object hypotheses at each new frame. From these hypotheses, recognized scene structures are incrementally constructed and tracked. Semantic labels are inferred using a multi-domain convolutional architecture which operates on the image time series and which enables efficient propagation of features as well as robust model registration. To evaluate this architecture, we introduce a large-scale RGB-D dataset JHUSEQ-25 as a new benchmark for the sequence-based scene understanding in complex and densely cluttered scenes. This dataset contains 25 RGB-D video sequences with 100,000 labeled frames in total. We validate our method on this dataset and demonstrate improved performance of semantic segmentation and 6-DoF object pose estimation compared with methods based on the single view.

UR - http://www.scopus.com/inward/record.url?scp=85006499472&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=85006499472&partnerID=8YFLogxK

U2 - 10.1109/IROS.2016.7759111

DO - 10.1109/IROS.2016.7759111

M3 - Conference contribution

AN - SCOPUS:85006499472

T3 - IEEE International Conference on Intelligent Robots and Systems

SP - 574

EP - 581

BT - IROS 2016 - 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems

PB - Institute of Electrical and Electronics Engineers Inc.

T2 - 2016 IEEE/RSJ International Conference on Intelligent Robots and Systems, IROS 2016

Y2 - 9 October 2016 through 14 October 2016

ER -
