Toward computer vision systems that understand real-world assembly processes

Jonathan D. Jones, Gregory Hager, Sanjeev Khudanpur

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Many applications of computer vision require robust systems that can parse complex structures as they evolve in time. Using a block construction task as a case study, we illustrate the main components involved in building such systems. We evaluate performance at three increasingly-detailed levels of spatial granularity on two multimodal (RGBD + IMU) datasets. On the first, designed to match the assumptions of the model, we report better than 90% accuracy at the finest level of granularity. On the second, designed to test the robustness of our model under adverse, real-world conditions, we report 67% accuracy and 91% precision at the mid-level of granularity. We show that this seemingly simple process presents many opportunities to expand the frontiers of computer vision and action recognition.

Original language: English (US)
Title of host publication: Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 426-434
Number of pages: 9
ISBN (Electronic): 9781728119755
DOI: https://doi.org/10.1109/WACV.2019.00051
State: Published - Mar 4 2019
Event: 19th IEEE Winter Conference on Applications of Computer Vision, WACV 2019 - Waikoloa Village, United States
Duration: Jan 7 2019 to Jan 11 2019

Publication series

Name: Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019

Conference

Conference: 19th IEEE Winter Conference on Applications of Computer Vision, WACV 2019
Country: United States
City: Waikoloa Village
Period: 1/7/19 to 1/11/19

Fingerprint

Computer vision

ASJC Scopus subject areas

  • Computer Vision and Pattern Recognition
  • Computer Science Applications

Cite this

Jones, J. D., Hager, G., & Khudanpur, S. (2019). Toward computer vision systems that understand real-world assembly processes. In Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019 (pp. 426-434). [8659114] (Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019). Institute of Electrical and Electronics Engineers Inc. https://doi.org/10.1109/WACV.2019.00051

@inproceedings{7e2f506da74e49648ddd79dc13bf1263,
title = "Toward computer vision systems that understand real-world assembly processes",
abstract = "Many applications of computer vision require robust systems that can parse complex structures as they evolve in time. Using a block construction task as a case study, we illustrate the main components involved in building such systems. We evaluate performance at three increasingly-detailed levels of spatial granularity on two multimodal (RGBD + IMU) datasets. On the first, designed to match the assumptions of the model, we report better than 90{\%} accuracy at the finest level of granularity. On the second, designed to test the robustness of our model under adverse, real-world conditions, we report 67{\%} accuracy and 91{\%} precision at the mid-level of granularity. We show that this seemingly simple process presents many opportunities to expand the frontiers of computer vision and action recognition.",
author = "Jones, {Jonathan D.} and Gregory Hager and Sanjeev Khudanpur",
year = "2019",
month = "3",
day = "4",
doi = "10.1109/WACV.2019.00051",
language = "English (US)",
series = "Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019",
publisher = "Institute of Electrical and Electronics Engineers Inc.",
pages = "426--434",
booktitle = "Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019",
}

TY - GEN
T1 - Toward computer vision systems that understand real-world assembly processes
AU - Jones, Jonathan D.
AU - Hager, Gregory
AU - Khudanpur, Sanjeev
PY - 2019/3/4
Y1 - 2019/3/4
AB - Many applications of computer vision require robust systems that can parse complex structures as they evolve in time. Using a block construction task as a case study, we illustrate the main components involved in building such systems. We evaluate performance at three increasingly-detailed levels of spatial granularity on two multimodal (RGBD + IMU) datasets. On the first, designed to match the assumptions of the model, we report better than 90% accuracy at the finest level of granularity. On the second, designed to test the robustness of our model under adverse, real-world conditions, we report 67% accuracy and 91% precision at the mid-level of granularity. We show that this seemingly simple process presents many opportunities to expand the frontiers of computer vision and action recognition.

UR - http://www.scopus.com/inward/record.url?scp=85063584580&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85063584580&partnerID=8YFLogxK
U2 - 10.1109/WACV.2019.00051
DO - 10.1109/WACV.2019.00051
M3 - Conference contribution
AN - SCOPUS:85063584580
T3 - Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019
SP - 426
EP - 434
BT - Proceedings - 2019 IEEE Winter Conference on Applications of Computer Vision, WACV 2019
PB - Institute of Electrical and Electronics Engineers Inc.
ER -