To Learn or Not to Learn: Visual Localization from Essential Matrices
Paper in proceedings, 2020

Visual localization is the problem of estimating the pose of a camera within a scene and is a key technology for autonomous robots. State-of-the-art approaches for accurate visual localization use scene-specific representations, resulting in the overhead of constructing these models when applying the techniques to new scenes. Recently, learned approaches based on relative pose estimation have been proposed, carrying the promise of easily adapting to new scenes. However, they are currently significantly less accurate than state-of-the-art approaches. In this paper, we are interested in analyzing this behavior. To this end, we propose a novel framework for visual localization from relative poses. Using a classical feature-based approach within this framework, we show state-of-the-art performance. Replacing the classical approach with learned alternatives at various levels, we then identify the reasons why deep learned approaches do not perform well. Based on our analysis, we make recommendations for future work.


Qunjie Zhou

Technical University of Munich

Torsten Sattler

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering, Imaging and Image Analysis

Marc Pollefeys

Swiss Federal Institute of Technology in Zürich (ETH)

Laura Leal-Taixe

Technical University of Munich

Proceedings - IEEE International Conference on Robotics and Automation

1050-4729 (ISSN)

3319-3326 9196607

2020 IEEE International Conference on Robotics and Automation, ICRA 2020
Paris, France

Subject Categories


Computer Science

Computer Vision and Robotics (Autonomous Systems)


