DeDoDe: Detect, Don't Describe - Describe, Don't Detect for Local Feature Matching
Paper in proceeding, 2024

Keypoint detection is a pivotal step in 3D reconstruction, whereby sets of (up to) K points are detected in each view of a scene. Crucially, the detected points need to be consistent between views, i.e., correspond to the same 3D point in the scene. One of the main challenges with keypoint detection is the formulation of the learning objective. Previous learning-based methods typically jointly learn descriptors with keypoints, and treat the keypoint detection as a binary classification task on mutual nearest neighbours. However, basing keypoint detection on descriptor nearest neighbours is a proxy task, which is not guaranteed to produce 3D-consistent keypoints. Furthermore, this ties the keypoints to a specific descriptor, complicating downstream usage. In this work, we instead learn keypoints directly from 3D consistency. To this end, we train the detector to detect tracks from large-scale SfM. As these points are often overly sparse, we derive a semi-supervised two-view detection objective to expand this set to a desired number of detections. To train a descriptor, we maximize the mutual nearest neighbour objective over the keypoints with a separate network. Results show that our approach, DeDoDe, achieves significant gains on multiple geometry benchmarks. Code is provided at
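The mutual nearest neighbour criterion mentioned in the abstract can be illustrated with a minimal sketch. This is not the DeDoDe implementation: it assumes plain Euclidean distance between descriptor vectors, and the names `squared_distance`, `nearest`, and `mutual_nearest_neighbours` are hypothetical helpers introduced only for illustration. A pair (i, j) is kept when descriptor j in the second image is the nearest neighbour of descriptor i in the first image, and vice versa.

```python
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two descriptor vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(query: Sequence[float], candidates: Sequence[Sequence[float]]) -> int:
    """Index of the candidate descriptor closest to the query."""
    return min(range(len(candidates)),
               key=lambda j: squared_distance(query, candidates[j]))


def mutual_nearest_neighbours(
    desc_a: Sequence[Sequence[float]],
    desc_b: Sequence[Sequence[float]],
) -> List[Tuple[int, int]]:
    """Return index pairs (i, j) that are each other's nearest neighbour."""
    if not desc_a or not desc_b:
        return []
    a_to_b = [nearest(d, desc_b) for d in desc_a]  # best match in B for each A
    b_to_a = [nearest(d, desc_a) for d in desc_b]  # best match in A for each B
    # Keep a pair only if the choice is reciprocated in both directions.
    return [(i, j) for i, j in enumerate(a_to_b) if b_to_a[j] == i]


# Toy descriptors (made-up values): three keypoints in image A, four in image B.
desc_a = [[0.0, 0.0], [1.0, 0.0], [5.0, 5.0]]
desc_b = [[0.9, 0.1], [0.1, -0.1], [9.0, 9.0], [4.0, 4.0]]
matches = mutual_nearest_neighbours(desc_a, desc_b)
print(matches)
```

In this toy example the third descriptor of image B is the nearest neighbour of no descriptor in image A, so it is left unmatched. The sketch only shows the matching criterion; which points are detected in the first place is a separate question, which is the decoupling the abstract argues for.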

feature matching


keypoint detection

3D reconstruction

local feature matching

image matching


Johan Edstedt

Linköping University

Georg Bökman

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Mårten Wadenbäck

Linköping University

Michael Felsberg

Linköping University

Proceedings - 2024 International Conference on 3D Vision, 3DV 2024

9798350362459 (ISBN)

11th International Conference on 3D Vision, 3DV 2024
Davos, Switzerland,


Robotics and Automation

Computer Vision and Robotics (Autonomous Systems)



