Part-of-Speech Taggers Make Errors on Unambiguous Sentences

Minerva Suvanto; Mattias Wahde; Marco L. Della Vedova

doi:10.1007/978-3-031-87327-0_10

Paper in proceedings, 2025

We show that commonly used part-of-speech (POS) taggers, despite their high reported performance, in many cases make tagging errors on simple and unambiguous sentences. We collect a new data set of non-ambiguous sentences that can easily be tagged by human taggers, but where at least one standard POS tagger makes precisely one tagging error. Furthermore, we present a method for generating rules that are meant to correct the output of a standard POS tagger. Applying this method to the new data set, we extract a set of such rules, which are then evaluated over another data set introduced in earlier work. Our results show that the method works, but also that the increase in tagging accuracy is rather small, probably due to the small size of our training data set. Finally, we present an analysis of POS tagging in general, concluding that there are multiple ambiguities that introduce unresolvable challenges in POS tagging.
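The abstract does not spell out what form the correction rules take, so the following is only a minimal sketch of the general idea of learning rules from a tagger's mistakes and then applying them as a post-processing step. The rule shape (previous predicted tag, lowercased word, predicted tag) and the names `learn_rules` and `apply_rules` are assumptions made for illustration, not the authors' method, and the toy sentences and tags are invented.

```python
from collections import Counter

# Hypothetical rule form (an assumption, not taken from the paper):
# (previous predicted tag, lowercased word, predicted tag) -> corrected tag.


def learn_rules(examples, min_count=1):
    """Collect correction rules from (words, predicted_tags, gold_tags) triples."""
    counts = Counter()
    for words, predicted, gold in examples:
        for i, (word, p, g) in enumerate(zip(words, predicted, gold)):
            if p != g:
                prev = predicted[i - 1] if i > 0 else "<s>"
                counts[(prev, word.lower(), p, g)] += 1
    rules = {}
    # most_common() yields the most frequent corrections first, so the first
    # correction stored for a given context is the best-supported one.
    for (prev, word, p, g), n in counts.most_common():
        if n >= min_count and (prev, word, p) not in rules:
            rules[(prev, word, p)] = g
    return rules


def apply_rules(words, predicted, rules):
    """Return a corrected copy of the tagger output; unmatched tags are kept."""
    corrected = list(predicted)
    for i, (word, p) in enumerate(zip(words, predicted)):
        prev = predicted[i - 1] if i > 0 else "<s>"
        corrected[i] = rules.get((prev, word.lower(), p), p)
    return corrected


# Invented toy data: the tagger labels the verb "bark" as a noun.
training = [
    (["The", "dogs", "bark", "loudly"],
     ["DET", "NOUN", "NOUN", "ADV"],
     ["DET", "NOUN", "VERB", "ADV"]),
]
rules = learn_rules(training)
fixed = apply_rules(["Big", "dogs", "bark"], ["ADJ", "NOUN", "NOUN"], rules)
print(fixed)
```

A rule set learned this way can only correct contexts it has seen, which is consistent with the abstract's remark that a small training set yields only a small gain in accuracy.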

Part-of-speech tagging

Natural language processing

Rule-based systems

Authors

Minerva Suvanto

Chalmers, Mechanics and Maritime Sciences, Vehicle Engineering and Autonomous Systems


Mattias Wahde

Chalmers, Mechanics and Maritime Sciences, Vehicle Engineering and Autonomous Systems


Marco L. Della Vedova

Chalmers, Mechanics and Maritime Sciences, Vehicle Engineering and Autonomous Systems


Lecture Notes in Computer Science

0302-9743 (ISSN) 1611-3349 (eISSN)

Vol. 15591 LNAI, pp. 207-221
9783031873263 (ISBN)

16th International Conference on Agents and Artificial Intelligence, ICAART 2024
Rome, Italy,

Subject categories (SSIF 2025)

Language Processing and Computational Linguistics

Computer Sciences

DOI

10.1007/978-3-031-87327-0_10


More information

Last updated

2025-06-27
