D2-Net: A Trainable CNN for Joint Description and Detection of Local Features
Paper in proceedings, 2019

In this work we address the problem of finding reliable pixel-level correspondences under difficult imaging conditions. We propose an approach where a single convolutional neural network plays a dual role: It is simultaneously a dense feature descriptor and a feature detector. By postponing the detection to a later stage, the obtained keypoints are more stable than their traditional counterparts based on early detection of low-level structures. We show that this model can be trained using pixel correspondences extracted from readily available large-scale SfM reconstructions, without any further annotations. The proposed method obtains state-of-the-art performance on both the difficult Aachen Day-Night localization dataset and the InLoc indoor localization benchmark, as well as competitive performance on other benchmarks for image matching and 3D reconstruction.
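The abstract describes a single network whose dense feature maps serve both as descriptors and as the basis for detection, with detection postponed until after those maps are computed. As a rough illustration only, the sketch below takes an already computed feature map and derives keypoints and descriptors from it. The selection rule (a pixel is kept when its strongest channel is also a strict local maximum in that channel), the function name `detect_and_describe`, and the data layout are assumptions made for this example and are not stated in this record; the CNN and its training are omitted.

```python
from typing import List, Tuple

# Hypothetical layout: feature maps indexed as [channel][row][col].
FeatureMap = List[List[List[float]]]


def detect_and_describe(features: FeatureMap) -> List[Tuple[int, int, List[float]]]:
    """Return (row, col, descriptor) for every pixel selected as a keypoint.

    Assumed rule (illustrative only): a pixel is a keypoint when its strongest
    channel is also a strict local maximum over the 3x3 neighbourhood in that
    same channel.
    """
    n_channels = len(features)
    height = len(features[0])
    width = len(features[0][0])
    keypoints = []
    for r in range(height):
        for c in range(width):
            # Descriptor: the vector of channel responses at this pixel.
            vec = [features[k][r][c] for k in range(n_channels)]
            best = max(range(n_channels), key=lambda k: vec[k])
            # Detection: compare with in-bounds neighbours in the winning channel.
            neighbours = [
                features[best][rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if (rr, cc) != (r, c)
            ]
            if all(vec[best] > v for v in neighbours):
                # L2-normalise so descriptors can be compared by dot product.
                norm = sum(x * x for x in vec) ** 0.5
                descriptor = [x / norm for x in vec] if norm > 0 else vec
                keypoints.append((r, c, descriptor))
    return keypoints
```

Because the same tensor feeds both steps, every detected keypoint already has its descriptor; no separate patch extraction stage is needed in this sketch.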

3D reconstruction

deep learning

machine learning

local features

visual localization


Mihai Dusmanu

Ecole Normale Superieure (ENS)


Eidgenössische Technische Hochschule Zürich (ETH)

Ignacio Rocco


Ecole Normale Superieure (ENS)

Tomas Pajdla

Ceske Vysoke Uceni Technicke v Praze

Marc Pollefeys

Microsoft Mixed Reality & AI Lab - Zürich

Eidgenössische Technische Hochschule Zürich (ETH)

Josef Sivic


Ceske Vysoke Uceni Technicke v Praze

Ecole Normale Superieure (ENS)

Akihiko Torii

Tokyo Institute of Technology

Torsten Sattler

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering, Digital Image Systems and Image Analysis

2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)


IEEE / CVF Conference on Computer Vision and Pattern Recognition
Long Beach, USA


Robotics and automation

Computer vision and robotics (autonomous systems)


