Understanding the Limitations of CNN-based Absolute Camera Pose Regression

Torsten Sattler; Qunjie Zhou; Marc Pollefeys; Laura Leal-Taixé

doi:10.1109/CVPR.2019.00342

Paper in proceedings, 2019

Visual localization is the task of accurate camera pose estimation in a known scene. It is a key problem in computer vision and robotics, with applications including self-driving cars, Structure-from-Motion, SLAM, and Mixed Reality. Traditionally, the localization problem has been tackled using 3D geometry. Recently, end-to-end approaches based on convolutional neural networks have become popular. These methods learn to directly regress the camera pose from an input image. However, they do not achieve the same level of pose accuracy as 3D structure-based methods. To understand this behavior, we develop a theoretical model for camera pose regression. We use our model to predict failure cases for pose regression techniques and verify our predictions through experiments. We furthermore use our model to show that pose regression is more closely related to pose approximation via image retrieval than to accurate pose estimation via 3D structure. A key result is that current approaches do not consistently outperform a handcrafted image retrieval baseline. This clearly shows that additional research is needed before pose regression algorithms are ready to compete with structure-based methods.

deep learning

machine learning

visual localization

camera pose estimation

Authors

Torsten Sattler

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering


Qunjie Zhou

Technische Universität München

Marc Pollefeys

Microsoft Mixed Reality & AI Lab - Zürich

Eidgenössische Technische Hochschule Zürich (ETH)

Laura Leal-Taixé

Technische Universität München

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

1063-6919 (ISSN)

Vol. 2019-June, pp. 3297-3307, art. no. 8954331
978-1-7281-3293-8 (ISBN)

IEEE / CVF Conference on Computer Vision and Pattern Recognition
Long Beach, USA

Subject categories (SSIF 2011)

Robotics and automation

Computer vision and robotics (autonomous systems)

DOI

10.1109/CVPR.2019.00342


More information

Last updated

2022-04-05
