Lexical and Grammar Resource Engineering for Runyankore & Rukiga: A Symbolic Approach
Licentiate thesis, 2021

Current research in computational linguistics and natural language processing (NLP) depends on the existence of language resources. Whereas these resources are available for a few well-resourced languages, many other languages have been neglected. Among the neglected and/or under-resourced languages are Runyankore and Rukiga (henceforth referred to as Ry/Rk). Recently, the NLP community has started to acknowledge that resources for under-resourced languages should also be given priority. One reason is that, as far as language typology is concerned, the few well-resourced languages do not represent the structural diversity of the remaining languages.

The central focus of this thesis is enabling the computational analysis and generation of utterances in Ry/Rk. Ry/Rk are two closely related languages spoken by about 3.4 and 2.4 million people respectively. They belong to the Nyoro-Ganda (JE10) language zone of the Great Lakes, Narrow Bantu of the Niger-Congo language family.

The computational processing of these languages is achieved by formalising their grammars using Grammatical Framework (GF) and its Resource Grammar Library (RGL). In addition to the grammars, a general-purpose computational lexicon for the two languages is developed. Although we use the lexicon primarily to increase the lexical coverage of the grammars tremendously, it can also be used for other NLP tasks.

In this thesis, a symbolic/rule-based approach is taken because the lack of adequate language resources makes data-driven NLP approaches unsuitable for these languages.

Computational Grammar

Runyankore

Grammar Resource

Grammatical Framework

Lexical Resource

Computational lexicon

Rukiga

Bantu Languages

Runyakitara

Resource Grammar Library

Language Resources

Grammar Engineering

CSE EDIT 8103 and online via Zoom
Opponent: Dr. Wanjiku Ng'ang'a, School of Computing and Informatics, University of Nairobi, Kenya

Author

David Bamutura

Chalmers, Computer Science and Engineering (Chalmers), Functional Programming

Towards Computational Resource Grammars for Runyankore and Rukiga

LREC 2020 - 12th International Conference on Language Resources and Evaluation, Conference Proceedings (2020), p. 2846-2854

Paper in proceeding

Bamutura, Sabiiti David (2021). "Ry/Rk-Lex: A computational lexicon for Runyankore and Rukiga languages". Accepted to the Northern European Association for Language Technology post-proceeding series of the Swedish Language Technology Conference (SLTC 2020).

Areas of Advance

Information and Communication Technology

Driving Forces

Sustainable development

Innovation and entrepreneurship

Subject Categories

Language Technology (Computational Linguistics)

General Language Studies and Linguistics

Specific Languages

Publisher

Chalmers

More information

Latest update

6/4/2021