AMST: Alternating Multimodal Skip Training
Paper in proceedings, 2025

Multimodal learning is a field of machine learning in which models combine multiple modalities to improve learning outcomes. However, modalities can differ in data representation and complexity, which may lead to learning imbalances during training. The time a modality takes to converge during training is a crucial indicator of such imbalance: because modalities converge at different rates, they may harmfully interfere with each other's learning when trained simultaneously, as is commonly done in multimodal settings. To mitigate this negative impact, we propose Alternating Multimodal Skip Training (AMST), which adjusts the training frequency of each individual modality. This novel method not only improves the performance of conventional multimodal models that learn from fused modalities but also enhances alternating models that train each modality separately. Additionally, it outperforms state-of-the-art models while reducing training time.
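The abstract describes AMST only at a high level. As a rough illustration of per-modality training frequencies, the sketch below freezes each modality encoder on epochs where it is not scheduled for an update, so slower-converging modalities can be trained more often than faster ones. The model, the `skip_freq` mapping, and all names are hypothetical assumptions for illustration; this is not the authors' implementation.

```python
import torch
import torch.nn as nn

class TwoModalityModel(nn.Module):
    """Toy fusion model with one encoder per modality (illustrative only)."""
    def __init__(self, dims, hidden=128, n_classes=10):
        super().__init__()
        # dims: e.g. {"audio": 40, "video": 512} -> one encoder per modality
        self.enc = nn.ModuleDict({
            m: nn.Sequential(nn.Linear(d, hidden), nn.ReLU())
            for m, d in dims.items()
        })
        self.head = nn.Linear(hidden * len(dims), n_classes)

    def forward(self, x):
        # x: dict mapping modality name -> feature tensor
        feats = [self.enc[m](x[m]) for m in sorted(self.enc)]
        return self.head(torch.cat(feats, dim=-1))

def amst_epoch(model, loader, optimizer, loss_fn, epoch, skip_freq):
    """Run one epoch, updating each modality encoder only on epochs
    selected by its (hypothetical) frequency in `skip_freq`."""
    for m, enc in model.enc.items():
        update = (epoch % skip_freq[m] == 0)
        for p in enc.parameters():
            # Skip training: freeze encoders that are off-schedule this epoch.
            p.requires_grad_(update)
    for x, y in loader:
        optimizer.zero_grad()
        loss = loss_fn(model(x), y)
        loss.backward()
        optimizer.step()
```

With, say, `skip_freq = {"audio": 1, "video": 2}`, the video encoder is updated only every other epoch, giving the slower-converging modality relatively more updates; skipping backward passes through frozen encoders is also one plausible source of the reduced training time reported in the abstract.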

modality convergence rate

skip training

multimodal learning

modality bias

modality imbalance

training frequency

Authors

Hugo Manuel Alves Henriques E Silva

Chalmers, Computer Science and Engineering, Data Science and AI

Hongguang Chen

Chalmers, Computer Science and Engineering

Selpi Selpi

Chalmers, Computer Science and Engineering, Data Science and AI

Machine Learning and Knowledge Discovery in Databases. Research Track. ECML PKDD 2025

European Conference, ECML PKDD 2025, Porto, Portugal

Areas of Advance

Information and Communication Technology

Subject categories (SSIF 2025)

Computer Sciences

Infrastructure

C3SE (-2020, Chalmers Centre for Computational Science and Engineering)

More information

Created

2025-05-28