Towards Hindi/Urdu FrameNets via the Multilingual FrameNet

Shafqat Mumtaz Virk; K V S Prasad

Paper in proceedings, 2018

The Multilingual FrameNet Project (MLFN, 2017) is using translations of Ken Robinson’s popular TED talk (Robinson,
2006) to study universal and cross-lingual aspects of frame annotation. There are no FrameNets yet for Hindi and Urdu,
but we are annotating the Hindi and Urdu translations of Robinson’s talk using the frames of the English FrameNet.
(Surprisingly, there was no Hindi translation, so we produced one ourselves.) Preprocessing is needed: the word-segmentation
and POS tagging tools available for Hindi and Urdu were satisfactory, the full-form lexicons less so. The web-based
multi-layer frame annotation tool allows additions to the lexicon, so we simply added each form as a new “word”, our
goal here being only to look at the frames and frame elements; we plan to look at grammatical function and phrase
type later. While some sentences show that the frame analysis of English or Portuguese will not carry over to Hindi or
Urdu for cultural or linguistic reasons, others are harder to be definite about. Partly, this is because there are so many
possible translations. An expected observation is that a choice of word can steer the focus from one frame to another.
Our annotations will help when we start building FrameNets for Hindi and Urdu.

FrameNet

Frame semantics

Multilingual FrameNet

Lexico-Semantic Resources

Authors

Shafqat Mumtaz Virk

Göteborgs universitet


K V S Prasad

Chalmers, Computer Science and Engineering, Functional Programming


Proceedings - International FrameNet Workshop 2018 Multilingual Framenets and Constructicons

69-74
979-10-95546-04-7 (ISBN)

LREC 2018 Workshop International FrameNet Workshop 2018: Multilingual Framenets and Constructicons
Miyazaki, Japan

Areas of Advance

Information and Communication Technology

Subject Categories (SSIF 2011)

Language Technology (Computational Linguistics)

General Language Studies and Linguistics

Specific Languages

More information

Last updated

2022-04-26
