Fine-grained Entailment: Resources for Greek NLI and Precise Entailment
Paper in proceedings, 2022

In this paper, we present a number of fine-grained resources for Natural Language Inference (NLI): several resources and validation methods for Greek NLI, and a resource for precise NLI. First, we extend the Greek version of the FraCaS test suite to include examples where the inference is directly linked to the syntactic/morphological properties of Greek. The new resource contains an additional 428 examples, bringing the dataset to 774 examples in total. Expert annotators created the additional resource, and we extensively validate the original Greek version of FraCaS with both non-expert and expert subjects. Next, we continue the work initiated by (CITATION), in which a subset of the RTE problems was labeled for missing hypotheses, and present a dataset an order of magnitude larger, annotating the whole SuperGLUE/RTE dataset with missing hypotheses. Lastly, we provide a de-dropped version of the Greek XNLI dataset, in which the pronouns that are missing due to the pro-drop nature of the language are inserted. We then run models on this version to assess the effect of that insertion and report the results.

Authors

Erini Amanaki

Panepistimio Kritis

Jean-Philippe Bernardy

Göteborgs universitet

Chalmers, Computer Science and Engineering, Computing Science

Stergios Chatzikyriakidis

Panepistimio Kritis

Robin Cooper

Göteborgs universitet

Simon Dobnik

Göteborgs universitet

Aram Karimi

Göteborgs universitet

Adam Ek

Göteborgs universitet

Eirini Chrysovalantou Giannikouri

Panepistimio Kritis

Vasiliki Katsouli

Panepistimio Kritis

Ilias Kolokousis

Panepistimio Kritis

Eirini Chrysovalantou Mamatzaki

Panepistimio Kritis

Dimitrios Papadakis

Panepistimio Kritis

Olga Petrova

Panepistimio Kritis

Erofili Psaltaki

Panepistimio Kritis

Charikleia Soupiona

Panepistimio Kritis

Effrosyni Skoulataki

Panepistimio Kritis

Christina Stefanidou

Panepistimio Kritis

Proceedings of the Workshop on Dataset Creation for Lower-Resourced Languages

44-52
978-2-493814-06-7 (ISBN)

Proceedings of the Workshop on Dataset Creation for Lower-Resourced Languages
Marseilles, France

Subject categories (SSIF 2025)

Language Processing and Computational Linguistics

Psychology

Comparative Language Studies and General Linguistics

More information

Last updated

2025-06-27