Towards Hindi/Urdu FrameNets via the Multilingual FrameNet
Paper in proceedings, 2018
2006) to study universal and cross-lingual aspects of frame annotation. There are no FrameNets yet for Hindi and Urdu,
but we are annotating the Hindi and Urdu translations of Robinson’s talk using the frames of the English FrameNet.
(Surprisingly, there was no Hindi translation, so we made one ourselves.) Preprocessing was needed: the word-segmentation
and POS tagging tools available for Hindi and Urdu were satisfactory, the full-form lexicons less so. The web-based
multi-layer frame annotation tool allows additions to the lexicon, so we simply added each form as a new “word”, our
goal here being only to look at the frames and frame elements; we plan to look at grammatical function and phrase
type later. While some sentences show that the frame analysis of English or Portuguese will not carry over to Hindi or
Urdu for cultural or linguistic reasons, others are harder to be definite about. Partly, this is because there are so many
possible translations. An expected observation is that a choice of word can steer the focus from one frame to another.
Our annotations will help when we start building FrameNets for Hindi and Urdu.
Shafqat Mumtaz Virk
University of Gothenburg
K V S Prasad
Chalmers, Computer Science and Engineering (Chalmers), Functional Programming
Proceedings - International FrameNet Workshop 2018 Multilingual Framenets and Constructicons
Areas of Advance
Information and Communication Technology
Language Technology (Computational Linguistics)
General Language Studies and Linguistics