Towards Hindi/Urdu FrameNets via the Multilingual FrameNet
Paper in proceeding, 2018

The Multilingual FrameNet Project (MLFN, 2017) is using translations of Ken Robinson’s popular TED talk (Robinson,
2006) to study universal and cross-lingual aspects of frame annotation. There are no FrameNets yet for Hindi and Urdu,
but we are annotating the Hindi and Urdu translations of Robinson’s talk using the frames of the English FrameNet.
(Surprisingly, there was no Hindi translation, so we produced one ourselves.) Preprocessing is needed: the word-segmentation
and POS tagging tools available for Hindi and Urdu were satisfactory, the full-form lexicons less so. The web-based
multi-layer frame annotation tool allows additions to the lexicon, so we simply added each form as a new “word”, our
goal here being only to look at the frames and frame elements—we plan to look at grammatical function and phrase
type later. While some sentences show that the frame analysis of English or Portuguese will not carry over to Hindi or
Urdu for cultural or linguistic reasons, others are harder to be definite about. Partly, this is because there are so many
possible translations. As expected, the choice of a word can steer the focus from one frame to another.
Our annotations will help when we start building FrameNets for Hindi and Urdu.

FrameNet

Frame semantics

Multilingual FrameNet

Lexico-Semantic Resources

Author

Shafqat Mumtaz Virk

University of Gothenburg

K V S Prasad

Chalmers, Computer Science and Engineering (Chalmers), Functional Programming

Proceedings - International FrameNet Workshop 2018 Multilingual Framenets and Constructicons

69-74
979-10-95546-04-7 (ISBN)

LREC 2018 Workshop International FrameNet Workshop 2018: Multilingual Framenets and Constructicons
Miyazaki, Japan

Areas of Advance

Information and Communication Technology

Subject Categories

Language Technology (Computational Linguistics)

General Language Studies and Linguistics

Specific Languages

More information

Latest update

4/26/2022