Efficient corpus search using unary and binary indexes
Paper in proceedings, 2024

We investigate how disk-based inverted indexes can be used for efficient searching in large annotated corpora. We give a formal semantics for simple corpus queries, and show how they can be translated into lookups in unary and binary indexes.
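As a rough illustration of the idea in the abstract, the sketch below builds a unary index (one token feature to positions) and a binary index (a pair of features on adjacent tokens to positions) over a toy corpus, then answers a simple sequence query by intersecting binary lookups. It is a minimal in-memory sketch, not the paper's disk-based implementation; the corpus, the function names (`build_unary`, `build_binary`, `search`) and the restriction to adjacent tokens are all assumptions made for the example.

```python
from collections import defaultdict

# Toy annotated corpus (hypothetical data): one feature dict per token.
corpus = [
    {"word": "the", "pos": "DET"},
    {"word": "old", "pos": "ADJ"},
    {"word": "man", "pos": "NOUN"},
    {"word": "the", "pos": "DET"},
    {"word": "boats", "pos": "NOUN"},
]


def build_unary(corpus):
    """Map each (feature, value) to the sorted list of token positions."""
    index = defaultdict(list)
    for i, token in enumerate(corpus):
        for feature, value in token.items():
            index[(feature, value)].append(i)
    return index


def build_binary(corpus):
    """Map each pair of (feature, value) on adjacent tokens to the
    position of the first token. Assumption: only distance 1 is indexed."""
    index = defaultdict(list)
    for i in range(len(corpus) - 1):
        for left in corpus[i].items():
            for right in corpus[i + 1].items():
                index[(left, right)].append(i)
    return index


def search(query, unary, binary):
    """Return start positions where consecutive tokens match the query.

    `query` is a list of (feature, value) constraints, one per token.
    A single-token query is a unary lookup; longer queries intersect
    binary lookups, each shifted back to the start of the match.
    """
    if len(query) == 1:
        return list(unary.get(query[0], []))
    candidates = None
    for offset in range(len(query) - 1):
        pair = (query[offset], query[offset + 1])
        hits = {pos - offset for pos in binary.get(pair, [])}
        candidates = hits if candidates is None else candidates & hits
    return sorted(candidates)


unary = build_unary(corpus)
binary = build_binary(corpus)
det_noun = search([("pos", "DET"), ("pos", "NOUN")], unary, binary)
```

Here `det_noun` holds the start position of every determiner directly followed by a noun. A real system would keep the position lists sorted on disk and intersect them without loading them fully into memory.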

Authors

Peter Ljunglöf

Chalmers, Computer Science and Engineering, Functional Programming

University of Gothenburg

Nicholas Smallbone

University of Gothenburg

Chalmers, Computer Science and Engineering, Functional Programming

Mijo Thoresson

Student at Chalmers

Victor Salomonsson

Student at Chalmers

20th Conference on Natural Language Processing, KONVENS 2024 - Proceedings of the Conference

149-158

20th Conference on Natural Language Processing (KONVENS 2024)
Vienna, Austria

Subject categories (SSIF 2025)

Natural Language Processing and Computational Linguistics

More information

Last updated

2025-06-27