Towards Reliable and Accurate Global Structure-from-Motion
Doctoral thesis, 2023
Global and Incremental Structure-from-Motion methods both serve to provide good initializations to bundle adjustment, each with different properties. While Global Structure-from-Motion has been shown to produce more accurate reconstructions than Incremental Structure-from-Motion, the latter scales better: it starts with a small subset of images and sequentially adds new views, allowing reconstruction of sequences with millions of images. Additionally, both Global and Incremental Structure-from-Motion methods rely on accurate models of the scene or object, and under noisy conditions or high model uncertainty they may produce poor initializations for bundle adjustment. Recently pOSE, a class of matrix factorization methods, has been proposed as an alternative to conventional Global SfM methods. These methods use VarPro, a second-order optimization method, to minimize a linear combination of an approximation of the reprojection errors and a regularization term based on an affine camera model, and have been shown to converge to global minima at a high rate even when starting from random camera calibration estimates.
This thesis aims to improve the reliability and accuracy of global SfM through three approaches. First, by studying conditions for global optimality of point set registration, a point cloud averaging method that can be used when (incomplete) 3D point clouds of the same scene in different coordinate systems are available. Second, by extending pOSE methods to different Structure-from-Motion problem instances, such as Non-Rigid SfM or radial distortion invariant SfM. Third and finally, by replacing the regularization term of pOSE methods with an exponential regularization on the projective depth of the 3D point estimates, resulting in a loss that achieves reconstructions with accuracy close to bundle adjustment.
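To make the pOSE objective mentioned in the abstract concrete, the following is a minimal sketch of how such a loss could be evaluated for given cameras and points. It assumes the commonly cited form of pOSE, a weighted sum (1 - eta) * OSE + eta * affine, where the object space error (OSE) is the reprojection residual multiplied by the projective depth. The function name `pose_loss`, the data layout, and the default weight `eta` are illustrative assumptions, not taken from the thesis. The sketch only evaluates the loss; it performs no VarPro optimization and does not include the exponential regularization proposed in the thesis.

```python
def pose_loss(cameras, points, observations, eta=0.1):
    """Evaluate a pOSE-style objective (illustrative sketch, not the thesis code).

    cameras:      list of 3x4 projective camera matrices (nested lists).
    points:       list of homogeneous 3D points [X, Y, Z, 1].
    observations: dict mapping (camera index, point index) -> (u, v) keypoint.
    eta:          weight of the affine regularization term (hypothetical default).
    """
    def dot(row, vec):
        return sum(r * v for r, v in zip(row, vec))

    ose = 0.0     # approximation of the reprojection error
    affine = 0.0  # regularization based on an affine camera model
    for (i, j), (u, v) in observations.items():
        P, X = cameras[i], points[j]
        px, py, depth = dot(P[0], X), dot(P[1], X), dot(P[2], X)
        # Object space error: residual scaled by the projective depth,
        # which avoids the division present in the true reprojection error.
        ose += (px - u * depth) ** 2 + (py - v * depth) ** 2
        # Affine term: residual of the first two camera rows alone,
        # which penalizes the trivial all-zero solution.
        affine += (px - u) ** 2 + (py - v) ** 2
    return (1.0 - eta) * ose + eta * affine


# Usage: one camera P = [I | 0] observing one point at depth 2.
cameras = [[[1, 0, 0, 0], [0, 1, 0, 0], [0, 0, 1, 0]]]
points = [[1.0, 2.0, 2.0, 1.0]]
observations = {(0, 0): (0.5, 1.0)}  # exact projection of the point
loss = pose_loss(cameras, points, observations, eta=0.1)
```

With an exact observation the OSE part vanishes and only the affine term contributes, which illustrates why the regularization weight is kept small in this kind of formulation.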
bundle adjustment
point set registration
camera calibration
global SfM
Structure-from-Motion
radial distortion
3D reconstruction
non-rigid SfM
pOSE
matrix factorization
Author
José Pedro Lopes Iglesias
Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering
Global Optimality for Point Set Registration Using Semidefinite Programming
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 2020 (2020), p. 8284-8292
Paper in proceeding
Accurate Optimization of Weighted Nuclear Norm for Non-Rigid Structure from Motion
Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics), Vol. 12372 (2020), p. 21-37
Paper in proceeding
Bilinear Parameterization for Non-Separable Singular Value Penalties
Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition (2021), p. 3896-3905
Paper in proceeding
Radial Distortion Invariant Factorization for Structure from Motion
Proceedings of the IEEE International Conference on Computer Vision (2021), p. 5886-5895
Paper in proceeding
robots or computer software with similar capabilities, the research area of computer vision has been studying ways to replicate the processes happening inside our brains using concepts from mathematics, geometry, and computer science.
Recent developments in computer vision, in particular those related to artificial intelligence, have given us extremely capable learning-based models that society has received with both amazement and caution.
We can now use these models to generate incredibly realistic images from text prompts, detect objects of interest in images or videos, and ask questions about particular images to chatbots that could without a doubt pass the Turing test with flying colors.
However, learning-based methods still do not outperform conventional methods in tasks that deeply rely on exact geometric relations such as 3D reconstruction or pose estimation from images. These two tasks are of great importance in applications such as autonomous driving or augmented and virtual reality, as they enable us to estimate the 3D models of objects or scenes as well as our position and orientation in relation to them.
In this work, I build upon conventional methods based on geometry, linear algebra, and optimization, with the goal of improving the reliability and accuracy of Structure-from-Motion, a problem that simultaneously addresses 3D reconstruction and pose estimation from keypoints detected and matched across multiple images of a scene. My aim is to extend the range of use cases to which these methods can be applied and, with that, get us slightly closer to performance levels that could unlock the next generation of real-life applications.
Optimization Methods with Performance Guarantees for Subspace Learning
Swedish Research Council (VR) (2018-05375), 2019-01-01 -- 2022-12-31.
Infrastructure
C3SE (Chalmers Centre for Computational Science and Engineering)
Subject Categories
Robotics
Signal Processing
ISBN
978-91-7905-863-0
Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 5329
Publisher
Chalmers