Fine-Tuning Language Models on Dutch Protest Event Tweets
Paper in proceeding, 2024

Being able to obtain timely information about an event, such as a protest, is becoming increasingly relevant with the rise of affective polarisation and social unrest around the world. Large-scale protests now tend to be organised and broadcast through social media, and analysing platforms such as X has proven to be an effective way to follow events during a protest. We therefore trained several language models on Dutch tweets and analysed their ability to classify whether a tweet expresses discontent, since such tweets may contain practical information about a protest. Our results show that models pre-trained on Twitter data, including Bernice and TwHIN-BERT, outperform models that were not. The results also show that Sentence Transformers is a promising model. Oversampling adds more value for models that were not trained on Twitter data. In line with previous work, preprocessing the data did not help a transformer language model make better predictions.
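The abstract does not say which oversampling technique was used. As a rough illustration only, the sketch below assumes plain random oversampling: minority-class tweets are duplicated at random until every class is as frequent as the largest one. The function name, the toy tweets and the 0/1 labels are hypothetical and are not taken from the paper. In practice, oversampling would normally be applied to the training split only.

```python
import random
from collections import Counter


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen examples of each smaller class until all
    classes are as frequent as the largest one (assumed technique)."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())

    out_texts, out_labels = list(texts), list(labels)
    for label, count in counts.items():
        # All original examples that carry this label.
        pool = [t for t, l in zip(texts, labels) if l == label]
        # Draw with replacement until the class reaches the target size.
        extra = [rng.choice(pool) for _ in range(target - count)]
        out_texts.extend(extra)
        out_labels.extend([label] * len(extra))
    return out_texts, out_labels


# Hypothetical toy data: 1 = expresses discontent, 0 = does not.
tweets = ["tweet a", "tweet b", "tweet c", "tweet d", "tweet e"]
labels = [0, 0, 0, 0, 1]

balanced_tweets, balanced_labels = random_oversample(tweets, labels)
print(Counter(balanced_labels))
```

After the call, both classes have the same number of examples, and the original examples are kept unchanged at the front of the returned lists.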

Author

Meagan Loerakker

Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering

Netherlands Police

Laurens H.F. Müter

Netherlands Police

Utrecht University

Marijn P. Schraagen

Utrecht University

CASE 2024 - 7th Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text, Proceedings of the Workshop

6-23
9798891760707 (ISBN)

7th Workshop on Challenges and Applications of Automated Extraction of Socio-Political Events from Text, CASE 2024
St. Julian's, Malta

PAPACUI: Proficiency Awareness in Physical ACtivity User Interfaces

Swedish Research Council (VR) (2022-03196), 2023-01-01 -- 2026-12-31.

Subject Categories

Language Technology (Computational Linguistics)

General Language Studies and Linguistics

Specific Languages

More information

Latest update

8/7/2024 5