Learning Structure-from-Motion with Graph Attention Networks
Paper in proceedings, 2024

In this paper we tackle the problem of learning Structure-from-Motion (SfM) through the use of graph attention networks. SfM is a classic computer vision problem that is solved through iterative minimization of reprojection errors, referred to as Bundle Adjustment (BA), starting from a good initialization. In order to obtain a good enough initialization for BA, conventional methods rely on a sequence of sub-problems (such as pairwise pose estimation, pose averaging or triangulation) which provide an initial solution that can then be refined using BA. In this work we replace these sub-problems by learning a model that takes as input the 2D keypoints detected across multiple views, and outputs the corresponding camera poses and 3D keypoint coordinates. Our model takes advantage of graph neural networks to learn SfM-specific primitives, and we show that it can be used for fast inference of the reconstruction for new and unseen sequences. The experimental results show that the proposed model outperforms competing learning-based methods, and challenges COLMAP while having lower runtime. Our code is available at: https://github.com/lucasbrynte/gasfm/.
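To make the quantity minimized by BA concrete, the following is a minimal sketch of a reprojection error computation under a standard pinhole camera model. It is not taken from the authors' repository: the function names `project` and `reprojection_rmse`, the data layout, and the toy numbers are all assumptions chosen for illustration.

```python
import math

def project(K, R, t, X):
    """Project 3D point X with a pinhole camera (hypothetical helper).

    K is a 3x3 intrinsic matrix, R a 3x3 rotation, t a 3-vector,
    all given as nested lists.
    """
    # World to camera coordinates: Xc = R X + t
    Xc = [sum(R[i][j] * X[j] for j in range(3)) + t[i] for i in range(3)]
    # Apply intrinsics, then divide by the third (depth) coordinate
    u = [sum(K[i][j] * Xc[j] for j in range(3)) for i in range(3)]
    return (u[0] / u[2], u[1] / u[2])

def reprojection_rmse(K, cameras, points, observations):
    """Root-mean-square reprojection error (illustrative, assumed layout).

    cameras: list of (R, t) pairs; points: list of 3D points;
    observations: dict mapping (camera index, point index) to a 2D keypoint.
    """
    squared = 0.0
    for (i, j), (x, y) in observations.items():
        R, t = cameras[i]
        px, py = project(K, R, t, points[j])
        squared += (px - x) ** 2 + (py - y) ** 2
    return math.sqrt(squared / len(observations))

# Toy scene (assumed values): one camera at the origin, two 3D points.
K = [[100.0, 0.0, 50.0], [0.0, 100.0, 50.0], [0.0, 0.0, 1.0]]
identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
cameras = [(identity, [0.0, 0.0, 0.0])]
points = [[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]]
observations = {(0, 0): (50.0, 50.0), (0, 1): (103.0, 54.0)}

rmse = reprojection_rmse(K, cameras, points, observations)
```

BA would adjust the camera poses and 3D points to reduce this error. In the setting described in the abstract, the learned model supplies the poses and points that serve as the starting estimate.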

Authors

Lucas Brynte

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

José Pedro Lopes Iglesias

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Carl Olsson

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Fredrik Kahl

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)

2575-7075 (eISSN)

4808-4817
979-8-3503-5301-3 (ISBN)

IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)
Seattle, WA, USA

Integrating geometry and semantics in computer vision

Swedish Research Council (VR) (2016-04445), 2017-01-01 -- 2020-12-31.

Subject Categories (SSIF 2011)

Computational Mathematics

Control Engineering

Computer Vision and Robotics (Autonomous Systems)

Infrastructure

Chalmers e-Commons

DOI

10.1109/CVPR52733.2024.00460

More information

Latest update

1/10/2025