Developing an interlingual translation lexicon using WordNets and Grammatical Framework
Paper i proceeding, 2014
The Grammatical Framework (GF) offers perfect translation between controlled subsets
of natural languages. E.g., an abstract syntax for a set of sentences in school mathematics
is the interlingua between the corresponding sentences in English and Hindi, say. GF
“resource grammars” specify how to say something in English or Hindi; these are reused
with “application grammars” that specify what can be said (mathematics, tourist
phrases, etc.). More recent robust parsing and parse-tree disambiguation allow GF to
parse arbitrary English text. We report here an experiment to linearise the resulting
tree directly to other languages (e.g. Hindi, German, etc.), i.e., we use a language independent
resource grammar as the interlingua. We focus particularly on the last part
of the translation system, the interlingual lexicon and word sense disambiguation (WSD).
We improved the quality of the wide coverage interlingual translation lexicon by using
the Princeton and Universal WordNet data. We then integrated an existing WSD tool
and replaced the usual GF style lexicons, which give one target word per source word,
by the WordNet based lexicons. These new lexicons and WSD improve the quality of
translation in most cases, as we show by examples. Both WordNets and WSD in general
are well known, but this is the first use of these tools with GF.
WordNet
word sense disambiguation
Grammatical Framework
lexicons