Towards an Algebraic Approach for Corpus Queries
Other conference contribution, 2024

Analysis of text corpora relies on specialised corpus search tools capable of handling huge amounts of annotated text. The extent to which these tools optimise queries to reduce execution times varies as widely as the tools themselves. We argue that a corpus algebra, analogous to relational algebra in relational database systems, would provide a valuable foundation for improving corpus query optimisation. We demonstrate a query optimisation approach based on algebraic transformations that vastly reduces query execution times.
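
To illustrate what an algebraic transformation of a corpus query might look like, the following is a minimal sketch and not the authors' corpus algebra. The toy corpus, the inverted index, and the operators `Match` and `And` are all hypothetical names introduced here for illustration. The rewrite relies only on the commutativity of intersection to evaluate the more selective operand first, much as a relational optimiser reorders joins.

```python
from dataclasses import dataclass

# Toy annotated corpus (assumption): each token has a word form and a POS tag.
CORPUS = [
    ("the", "DET"), ("cat", "NOUN"), ("sat", "VERB"),
    ("on", "ADP"), ("the", "DET"), ("mat", "NOUN"),
]

# Inverted index: (attribute, value) -> set of token positions.
INDEX = {}
for position, (word, tag) in enumerate(CORPUS):
    INDEX.setdefault(("word", word), set()).add(position)
    INDEX.setdefault(("pos", tag), set()).add(position)


@dataclass(frozen=True)
class Match:
    """Hypothetical leaf operator: positions where attr == value."""
    attr: str
    value: str


@dataclass(frozen=True)
class And:
    """Hypothetical binary operator: positions matched by both operands."""
    left: object
    right: object


def evaluate(expr):
    """Evaluate an expression to a set of token positions."""
    if isinstance(expr, Match):
        return INDEX.get((expr.attr, expr.value), set())
    left = evaluate(expr.left)
    if not left:
        # Nothing can survive the intersection, so skip the right operand.
        return set()
    return left & evaluate(expr.right)


def estimate(expr):
    """Upper bound on the result size, read from the index."""
    if isinstance(expr, Match):
        return len(INDEX.get((expr.attr, expr.value), ()))
    return min(estimate(expr.left), estimate(expr.right))


def optimise(expr):
    """Rewrite And(a, b) to And(b, a) when b is estimated to be smaller.

    The rewrite is valid because set intersection is commutative, so the
    result is unchanged while the cheaper operand is evaluated first.
    """
    if isinstance(expr, Match):
        return expr
    left, right = optimise(expr.left), optimise(expr.right)
    if estimate(right) < estimate(left):
        left, right = right, left
    return And(left, right)


query = And(Match("pos", "NOUN"), Match("word", "mat"))
plan = optimise(query)
```

Here `plan` places `Match("word", "mat")` first because the index reports fewer occurrences for it, and `evaluate(query)` and `evaluate(plan)` return the same set of positions. A real corpus engine would need richer operators (sequences, structural constraints) and a cost model beyond index counts.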

Authors

Niklas Deworetzki

University of Gothenburg

Chalmers, Computer Science and Engineering (Chalmers), Functional Programming

Peter Ljunglöf

University of Gothenburg

Chalmers, Computer Science and Engineering (Chalmers), Functional Programming

Nicholas Smallbone

Chalmers, Computer Science and Engineering (Chalmers), Functional Programming

University of Gothenburg

Swedish Language Technology Conference
Subject Categories (SSIF 2025)

Natural Language Processing

Computer Sciences

More information

Latest update

9/8/2025