GF Runtime System
Licentiate thesis, 2009
Natural languages have been a subject of study for centuries and remain a hot topic today. The demand for computer systems that can communicate directly in natural language poses new challenges. Computational resources such as grammars and lexicons, together with efficient processing tools, are needed.
Grammars are described as computer programs in declarative domain-specific languages. Just like any other programming language, they require mature compilers and efficient runtime systems.
The topic of this thesis is the runtime system for the Grammatical Framework (GF) language. The first part of the thesis describes the semantics of the Portable Grammar Format (PGF). This is a low-level format that plays the same role for GF as JVM bytecode does for Java. The representation is designed to be as simple as possible, so that interpreters for it are easy to write. The second part presents the incremental parsing algorithm for PGF. Until now, parser performance has always been a bottleneck in GF. The new parser is orders of magnitude faster than any of the previous implementations.
The last contribution of the thesis is the development of the Bulgarian resource grammar. Bulgarian is the first Balkan language in the resource library, and this is the first open-source grammar for it. The grammar development also served as an important benchmark for the development of the parsing algorithm.