VICMus: Variance-Invariance-Covariance Regularization for Music Representation Learning
Paper in proceeding, 2024

Recent self-supervised learning methods often prevent informational collapse by implicitly regularizing information. Variance-Invariance-Covariance regularization (VICReg) was introduced as a non-contrastive loss function that explicitly maximizes information through regularization. While VICReg has garnered substantial interest in computer vision, its application to the music domain remains unexplored. To address this gap, we introduce VICMus: VICReg for music representation learning. We pre-train VICMus on the Free Music Archive and achieve 36.3 mAP on MagnaTagaTune, outperforming Contrastive Learning of Musical Representations (CLMR), a recent contrastive method that was pre-trained on the ten times larger Million Song Dataset and reaches 35.6 mAP. We also evaluate VICMus on the Holistic Audio Representation Evaluation Suite (HARES)-music benchmark, where it achieves an average score of 51.7. Our results indicate that while VICMus may not yet match the performance of state-of-the-art self-supervised models, it offers a promising and computationally efficient avenue for music representation learning. Our code and models are available at https://github.com/SebastianLoef/VICMus.
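To illustrate the three terms that give VICReg its name, the following is a minimal pure-Python sketch of the loss as it is usually formulated: an invariance term between two views, a hinge on each dimension's standard deviation, and a penalty on off-diagonal covariance. It is not taken from the VICMus repository. The function names, the list-of-lists input format, and the default weights (25, 25, 1, which are the defaults reported for the original VICReg) are assumptions made for illustration, and the settings used for VICMus may differ.

```python
import math


def _center(z):
    """Subtract the per-dimension batch mean from every embedding."""
    n, d = len(z), len(z[0])
    means = [sum(row[j] for row in z) / n for j in range(d)]
    return [[row[j] - means[j] for j in range(d)] for row in z]


def _variance_term(zc, gamma, eps):
    """Hinge loss that pushes each dimension's std (over the batch) above gamma."""
    n, d = len(zc), len(zc[0])
    total = 0.0
    for j in range(d):
        var = sum(row[j] ** 2 for row in zc) / (n - 1)
        total += max(0.0, gamma - math.sqrt(var + eps))
    return total / d


def _covariance_term(zc):
    """Sum of squared off-diagonal covariance entries, divided by the dimension."""
    n, d = len(zc), len(zc[0])
    total = 0.0
    for i in range(d):
        for j in range(d):
            if i != j:
                cov = sum(row[i] * row[j] for row in zc) / (n - 1)
                total += cov ** 2
    return total / d


def vicreg_loss(z_a, z_b, sim_w=25.0, var_w=25.0, cov_w=1.0, gamma=1.0, eps=1e-4):
    """VICReg-style loss for two views of the same batch.

    z_a, z_b: lists of n embeddings, each a list of d floats (hypothetical format).
    The weights and gamma/eps are assumed defaults, not values from VICMus.
    """
    n = len(z_a)
    # Invariance: squared Euclidean distance between paired embeddings, averaged over the batch.
    invariance = sum(
        sum((a - b) ** 2 for a, b in zip(row_a, row_b))
        for row_a, row_b in zip(z_a, z_b)
    ) / n
    ca, cb = _center(z_a), _center(z_b)
    variance = _variance_term(ca, gamma, eps) + _variance_term(cb, gamma, eps)
    covariance = _covariance_term(ca) + _covariance_term(cb)
    return sim_w * invariance + var_w * variance + cov_w * covariance


# A collapsed batch (every embedding identical) versus a spread-out, decorrelated one.
collapsed = [[1.0, 1.0]] * 4
spread = [[1.0, -1.0], [-1.0, 1.0], [1.0, 1.0], [-1.0, -1.0]]
loss_collapsed = vicreg_loss(collapsed, collapsed)
loss_spread = vicreg_loss(spread, spread)
```

In this toy comparison the two views are identical, so the invariance term is zero in both cases. The collapsed batch is still penalized through the variance term, which is the explicit collapse prevention the abstract refers to, while the spread batch incurs no variance or covariance penalty.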

music embeddings

representation learning

regularization

Self-supervised learning

Authors

Sebastian Löf

Epidemic Sound

Student at Chalmers

Cody Hesse

Communication and Marketing

Epidemic Sound

Carl Thomé

Epidemic Sound

Carlos Lordelo

Epidemic Sound

Jens Ahrens

Chalmers, Architecture and Civil Engineering, Applied Acoustics

2024 IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2024 - Proceedings

475-479
979-8-3503-4485-1 (ISBN)

49th IEEE International Conference on Acoustics, Speech, and Signal Processing Workshops, ICASSPW 2024
Seoul, South Korea,

Areas of Advance

Information and Communication Technology

Subject Categories

Musicology

Signal Processing

DOI

10.1109/ICASSPW62465.2024.10627508

More information

Last updated

2024-09-06