RoMa: Robust Dense Feature Matching
Paper in proceedings, 2024

Feature matching is an important computer vision task that involves estimating correspondences between two images of a 3D scene, and dense methods estimate all such correspondences. The aim is to learn a robust model, i.e., a model able to match under challenging real-world changes. In this work, we propose such a model, leveraging frozen pretrained features from the foundation model DINOv2. Although these features are significantly more robust than local features trained from scratch, they are inherently coarse. We therefore combine them with specialized ConvNet fine features, creating a precisely localizable feature pyramid. To further improve robustness, we propose a tailored transformer match decoder that predicts anchor probabilities, which enables it to express multimodality. Finally, we propose an improved loss formulation through regression-by-classification with subsequent robust regression. We conduct a comprehensive set of experiments that show that our method, RoMa, achieves significant gains, setting a new state-of-the-art. In particular, we achieve a 36% improvement on the extremely challenging WxBS benchmark. Code is provided at github.com/Parskatt/RoMa.
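As an informal illustration of the coarse-to-fine design described in the abstract, below is a minimal PyTorch sketch of a two-level feature pyramid pairing frozen DINOv2 patch features with a small ConvNet for fine features. This is not the authors' implementation: the CoarseFinePyramid class, its layer sizes, and the projection head are illustrative assumptions, and the match decoder, anchor-probability prediction, and regression-by-classification loss are not shown.

import torch
import torch.nn as nn

class CoarseFinePyramid(nn.Module):
    """Two-level feature pyramid: frozen DINOv2 patch features for robust
    coarse matching, plus a small ConvNet for precisely localizable fine
    features (illustrative layer sizes, not the paper's exact architecture)."""

    def __init__(self, dinov2, coarse_dim=1024, proj_dim=512, fine_dim=64):
        super().__init__()
        self.dinov2 = dinov2.eval()
        for p in self.dinov2.parameters():  # keep the foundation model frozen
            p.requires_grad_(False)
        self.coarse_proj = nn.Conv2d(coarse_dim, proj_dim, kernel_size=1)
        self.fine_net = nn.Sequential(      # specialized fine-feature ConvNet
            nn.Conv2d(3, fine_dim, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(fine_dim, fine_dim, 3, stride=2, padding=1), nn.ReLU(inplace=True),
        )

    def forward(self, img):  # img: (B, 3, H, W), H and W divisible by 14
        b, _, h, w = img.shape
        with torch.no_grad():
            tokens = self.dinov2.forward_features(img)["x_norm_patchtokens"]  # (B, N, C)
        hc, wc = h // 14, w // 14  # DINOv2 uses 14x14 patches
        coarse = tokens.transpose(1, 2).reshape(b, -1, hc, wc)
        return {"coarse": self.coarse_proj(coarse),  # robust, low resolution
                "fine": self.fine_net(img)}          # precise, higher resolution

# Example usage (downloads pretrained weights):
# backbone = torch.hub.load("facebookresearch/dinov2", "dinov2_vitl14")
# pyramid = CoarseFinePyramid(backbone)

Freezing the DINOv2 backbone keeps the robust coarse representation intact while only the fine ConvNet and projection are trained, which mirrors the frozen-features idea stated in the abstract.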

feature matching

image matching

dense feature matching

dense matching

geometry estimation

3D vision

two-view geometry

Authors

Johan Edstedt

Linköping University

Qiyu Sun

East China University of Science and Technology

Georg Bökman

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Mårten Wadenbäck

Linköping University

Michael Felsberg

Linköping University

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

1063-6919 (ISSN)

19790-19800
9798350353006 (ISBN)

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024, Seattle, USA

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2025)

Computer Graphics and Computer Vision

Computer Sciences

Infrastructure

Chalmers e-Commons (incl. C3SE, 2020-)

DOI

10.1109/CVPR52733.2024.01871

More information

Last updated

2025-05-20