VICMus: Variance-Invariance-Covariance Regularization for Music Representation Learning

Sebastian Löf; Cody Hesse; Carl Thomé; Carlos Lordelo; Jens Ahrens

doi:10.1109/ICASSPW62465.2024.10627508

Paper in proceedings, 2024

Recent self-supervised learning methods often prevent informational collapse by implicitly regularizing information. Variance-Invariance-Covariance regularization (VICReg) was introduced as a non-contrastive loss function that explicitly maximizes information through regularization. While VICReg has garnered substantial interest in computer vision, its application to the music domain remains unexplored. To address this gap, we introduce VICMus, which applies VICReg to music representation learning. We pre-train VICMus on the Free Music Archive and achieve 36.3 mAP on MagnaTagATune, outperforming the 35.6 mAP of Contrastive Learning of Musical Representations (CLMR), a recent contrastive method pre-trained on the ten times larger Million Song Dataset. We also evaluate VICMus on the music subset of the Holistic Audio Representation Evaluation Suite (HARES) benchmark and achieve an average score of 51.7. Our results indicate that while VICMus may not yet match the performance of state-of-the-art self-supervised models, it offers a promising and computationally efficient avenue for music representation learning. Our code and models are available at https://github.com/SebastianLoef/VICMus.
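The three terms that give VICReg its name can be illustrated with a minimal NumPy sketch. This is not the VICMus implementation (see the linked repository for that); the function name `vicreg_loss`, its parameter names, and the default weights (25, 25, 1, as recalled from the original VICReg paper) are assumptions made for illustration, and the inputs stand in for embeddings of two augmented views of the same batch of audio clips.

```python
import numpy as np


def vicreg_loss(z_a, z_b, sim_weight=25.0, var_weight=25.0,
                cov_weight=1.0, gamma=1.0, eps=1e-4):
    """Illustrative VICReg-style loss for two (batch, dim) embedding arrays.

    All names and default weights are assumptions for this sketch,
    not settings taken from VICMus.
    """
    n, d = z_a.shape

    # Invariance: mean squared distance between the two views.
    invariance = np.mean((z_a - z_b) ** 2)

    def variance_term(z):
        # Hinge on the per-dimension standard deviation: penalize any
        # dimension whose std over the batch falls below gamma.
        std = np.sqrt(z.var(axis=0, ddof=1) + eps)
        return np.mean(np.maximum(0.0, gamma - std))

    def covariance_term(z):
        # Sum of squared off-diagonal covariance entries, scaled by d,
        # which pushes embedding dimensions to be decorrelated.
        zc = z - z.mean(axis=0)
        cov = zc.T @ zc / (n - 1)
        off_diag = cov - np.diag(np.diag(cov))
        return np.sum(off_diag ** 2) / d

    variance = variance_term(z_a) + variance_term(z_b)
    covariance = covariance_term(z_a) + covariance_term(z_b)
    total = (sim_weight * invariance
             + var_weight * variance
             + cov_weight * covariance)
    parts = {"invariance": invariance,
             "variance": variance,
             "covariance": covariance}
    return total, parts


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    collapsed = np.ones((8, 4))            # every clip maps to the same vector
    spread = rng.standard_normal((256, 16))
    print(vicreg_loss(collapsed, collapsed)[1])
    print(vicreg_loss(spread, spread)[1])
```

In the collapsed batch the invariance and covariance terms are zero, yet the variance term is close to its maximum, which is how the loss explicitly discourages collapse without needing negative pairs.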

representation learning

Self-supervised learning

music embeddings

regularization

Authors

Sebastian Löf

Epidemic Sound AB

Student at Chalmers

Cody Hesse

Communication and Marketing

Epidemic Sound AB

Carl Thomé

Epidemic Sound AB

Carlos Lordelo

Epidemic Sound AB

Jens Ahrens

Chalmers, Architecture and Civil Engineering, Applied Acoustics

2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2024 - Proceedings

475-479
979-8-3503-4485-1 (ISBN)

49th IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2024
Seoul, South Korea

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2011)

Musicology

Signal Processing

DOI

10.1109/ICASSPW62465.2024.10627508

More information

Last updated

2025-08-26
