Smart Paradigms and the predictability and complexity of inflectional morphology
Paper in proceedings, 2012

Morphological lexica are often implemented on top of morphological paradigms, corresponding to different ways of building the full inflection table of a word. Computationally precise lexica may use hundreds of paradigms, and it can be hard for a lexicographer to choose among them. To automate this task, this paper introduces the notion of a smart paradigm. A smart paradigm is a meta-paradigm that inspects the base form and tries to infer which low-level paradigm applies. If the result is uncertain, more forms are given for discrimination. The number of forms needed on average is a measure of the predictability of an inflection system. The overall complexity of the system also has to take into account the code size of the paradigm definitions themselves. This paper evaluates the smart paradigms implemented in the open-source GF Resource Grammar Library. Predictability and complexity are estimated for four different languages: English, French, Swedish, and Finnish. The main result is that predictability does not decrease when the complexity of morphology grows, which means that smart paradigms provide an efficient tool for the manual construction and/or automatic bootstrapping of lexica.
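The idea of a smart paradigm and of the average number of forms as a predictability measure can be illustrated with a minimal sketch. This is not the GF Resource Grammar Library implementation: the function names (`smart_noun`, `forms_needed`, `average_forms_needed`), the three low-level paradigms, and the simplified English plural rules are all assumptions made for illustration.

```python
# Illustrative sketch only: hypothetical names and simplified English rules,
# not the actual GF Resource Grammar Library code.

VOWELS = "aeiou"


# Low-level paradigms: each one builds a (tiny) inflection table.
def regular_noun(sg):
    return {"sg": sg, "pl": sg + "s"}


def es_noun(sg):
    return {"sg": sg, "pl": sg + "es"}


def y_noun(sg):
    return {"sg": sg, "pl": sg[:-1] + "ies"}


def smart_noun(sg, pl=None):
    """Smart paradigm: inspect the base form and pick a low-level paradigm.

    If a second form is supplied, it overrides the guess; this models
    "more forms are given for discrimination".
    """
    if pl is not None:
        return {"sg": sg, "pl": pl}
    if sg.endswith(("s", "x", "z", "sh", "ch")):
        return es_noun(sg)
    if sg.endswith("y") and len(sg) > 1 and sg[-2] not in VOWELS:
        return y_noun(sg)
    return regular_noun(sg)


def forms_needed(sg, pl):
    """Number of forms required to get the correct table for this word."""
    return 1 if smart_noun(sg)["pl"] == pl else 2


def average_forms_needed(lexicon):
    """Predictability estimate: mean number of forms over a lexicon."""
    return sum(forms_needed(sg, pl) for sg, pl in lexicon) / len(lexicon)


if __name__ == "__main__":
    toy_lexicon = [("dog", "dogs"), ("fly", "flies"),
                   ("mouse", "mice"), ("box", "boxes")]
    print(average_forms_needed(toy_lexicon))
```

In this toy lexicon only "mouse" needs a second form, so the average stays close to one. A value near 1 would indicate a highly predictable inflection system under these assumed rules.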

linguistics

computational linguistics

natural language

Authors

Gregoire Detrez

Chalmers, Computer Science and Engineering, Computing Science

Aarne Ranta

Chalmers, Computer Science and Engineering, Computing Science

EACL 2012 - 13th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings

645-653

13th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2012
Avignon, France

Subject categories

Language Technology (Computational Linguistics)

Computer Science

Computer Systems

More information

Last updated

2020-01-20