HEAL-SWIN: A Vision Transformer on the Sphere

Oscar Carlsson; Jan Gerken; Hampus Linander; Heiner Spieß; Fredrik Ohlsson; Christoffer Petersson; Daniel Persson

doi:10.1109/CVPR52733.2024.00580

Paper in proceedings, 2024

High-resolution wide-angle fisheye images are becoming increasingly important for robotics applications such as autonomous driving. However, using ordinary convolutional neural networks or vision transformers on this data is problematic due to projection and distortion losses introduced when projecting to a rectangular grid on the plane. We introduce the HEAL-SWIN transformer, which combines the highly uniform Hierarchical Equal Area iso-Latitude Pixelation (HEALPix) grid used in astrophysics and cosmology with the Hierarchical Shifted-Window (SWIN) transformer to yield an efficient and flexible model capable of training on high-resolution, distortion-free spherical data. In HEAL-SWIN, the nested structure of the HEALPix grid is used to perform the patching and windowing operations of the SWIN transformer, enabling the network to process spherical representations with minimal computational overhead. We demonstrate the superior performance of our model on both synthetic and real automotive datasets, as well as a selection of other image datasets, for semantic segmentation, depth regression and classification tasks. Our code is publicly available at https://github.com/JanEGerken/HEAL-SWIN.
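
To illustrate why the nested structure keeps patching and windowing cheap, here is a minimal sketch, not the authors' implementation. It relies only on two standard HEALPix facts: a grid of resolution `nside` has `12 * nside**2` pixels, and in nested ordering pixel `p` at a coarser level has children `4p, ..., 4p + 3` at the next finer level. Under that ordering, grouping pixels into patches and patches into windows reduces to array reshapes. The function names, the patch size of 4 and the window size of 16 are illustrative assumptions, and the shifted-window step is not shown.

```python
import numpy as np


def npix(nside: int) -> int:
    """Number of pixels of a full-sphere HEALPix grid at resolution nside."""
    return 12 * nside * nside


def patch_nested(x: np.ndarray, patch_size: int = 4) -> np.ndarray:
    """Merge consecutive pixels (nested ordering) into patch tokens.

    x has shape (npix, channels). With patch_size a power of 4, each group of
    consecutive pixels shares one parent pixel on a coarser HEALPix level.
    """
    n, c = x.shape
    if n % patch_size != 0:
        raise ValueError("pixel count must be divisible by patch_size")
    return x.reshape(n // patch_size, patch_size * c)


def window_nested(tokens: np.ndarray, window_size: int = 16) -> np.ndarray:
    """Split tokens (nested ordering) into non-overlapping attention windows."""
    n, d = tokens.shape
    if n % window_size != 0:
        raise ValueError("token count must be divisible by window_size")
    return tokens.reshape(n // window_size, window_size, d)


# Hypothetical input: an RGB signal on a full sphere at nside = 8.
nside = 8
x = np.arange(npix(nside) * 3, dtype=np.float32).reshape(npix(nside), 3)

patches = patch_nested(x)         # 4 pixels x 3 channels per token
windows = window_nested(patches)  # 16 tokens per window

print(patches.shape, windows.shape)
```

Because both steps are plain reshapes, no neighbour lookups or interpolation are needed. This is one reading of the "minimal computational overhead" mentioned in the abstract.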

semantic segmentation

fisheye images

transformer

image classification

spherical grid

depth estimation

omni-directional images

Authors

Oscar Carlsson

Chalmers, Mathematical Sciences, Algebra and Geometry

Jan Gerken

Chalmers, Mathematical Sciences, Algebra and Geometry

Hampus Linander

Chalmers, Mathematical Sciences, Algebra and Geometry

Heiner Spieß

Technische Universität Berlin

Fredrik Ohlsson

Umeå universitet

Christoffer Petersson

Chalmers, Mathematical Sciences, Algebra and Geometry

Zenseact AB

Daniel Persson

Chalmers, Mathematical Sciences, Algebra and Geometry

Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition

1063-6919 (ISSN)

6067-6077
9798350353006 (ISBN)

2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2024
Seattle, USA

Subject categories (SSIF 2011)

Computer Vision and Robotics (Autonomous Systems)

DOI

10.1109/CVPR52733.2024.00580

More information

Last updated

2024-11-06
