Improving Language Models Using Augmentation and Multi-Modality

Tobias Norlund

Licentiate thesis, 2023

Language models have become a core component of modern Natural Language Processing (NLP), as they constitute a powerful base that is easily adapted to many language processing tasks. Part of their strength lies in their ability to embed associations that represent general world knowledge. However, the associations formed by these models are brittle, even when the models are scaled to huge sizes and trained on massive amounts of data. This, in combination with other problems such as a lack of attributability and high costs, motivates us to investigate other methods to improve on these aspects.

In this thesis, we investigate methods that augment language models with additional contextual information, with the aim of simplifying the language modeling problem and promoting the formation of desirable associations. We also investigate whether multi-modal data can assist in forming associations that would otherwise be difficult to obtain from textual data alone.

In our experiments, we show that augmentation is effective toward these ends in both textual and multi-modal settings. We also demonstrate that visual data can assist in forming knowledge-representing associations in a language model.

natural language processing

contextual augmentation

multimodal language modeling

language models

Room Analysen, EDIT Building, Hörsalsvägen 11

Opponent: Pontus Stenetorp, Associate Professor, University College London, United Kingdom

Online disputation

Author

Tobias Norlund

Chalmers, Computer Science and Engineering, Data Science and AI

Building a Swedish Open-Domain Conversational Language Model

Proceedings of the 23rd Nordic Conference on Computational Linguistics (NoDaLiDa) (2021), p. 357-366

Paper in proceedings

Transferring Knowledge from Vision to Language: How to Achieve it and how to Measure it?

BlackboxNLP 2021 - Proceedings of the 4th BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (2021), p. 149-162

Paper in proceedings

Cross-modal Transfer Between Vision and Language for Protest Detection

CASE 2022 - 5th Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text, Proceedings of the Workshop (2022), p. 56-60

Paper in proceedings

On the Generalization Ability of Retrieval-Enhanced Transformers

EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics, Findings of EACL 2023 (2023), p. 1455-1463

Paper in proceedings

Subject categories (SSIF 2011)

Other Computer and Information Science

Language Technology (Computational Linguistics)

Computer Sciences

Publisher

Chalmers

More information

Last updated

2023-04-24
