Word Representations for Emergent Communication and Natural Language Processing
Doctoral thesis, 2018

The task of listing all semantic properties of a single word might seem manageable at first, but as you unravel all the subtle, context-dependent variations in meaning that a word can encompass, you soon realize that a precise mathematical definition of a word’s semantics is extremely difficult. By analogy, humans have no problem identifying their favorite pet in an image, but precisely defining how they do so is still beyond our capabilities. A solution that has proved effective in the visual domain is to learn abstract representations using machine learning. Inspired by the success of learned representations in computer vision, the line of work presented in this thesis explores learned word representations in three different contexts.

Starting in the domain of artificial languages, three computational frameworks for emergent communication between collaborating agents are developed in an attempt to study word representations that exhibit grounding of concepts. The first two are designed to emulate the natural development of discrete color words using deep reinforcement learning, and are used to simulate the emergence of color terms that partition the continuous color spectrum of visible light. The properties of the emerged color communication schema are compared to human languages to ensure its validity as a cognitive model, and the frameworks are subsequently used to explore central questions in cognitive science about universals in language within the semantic domain of color. Moving beyond the color domain, a third framework is developed for the less controlled environment of human faces and multi-step communication. As in the color domain, we carefully analyze the semantic properties of the words that emerge between the agents, in this case focusing on their grounding.

Turning to empirical usefulness, different types of learned word representations are evaluated in the context of automatic document summarisation, word sense disambiguation, and word sense induction. The results show great potential for learned word representations in natural language processing: they reach state-of-the-art performance in all three applications and outperform previous methods in two of them.

Finally, although learned word representations seem to improve the performance of real-world systems, they also lack interpretability when compared to classical hand-engineered representations. Acknowledging this, an effort is made towards constructing learned representations that regain some of that interpretability by designing and evaluating disentangled representations, which could be used to represent words in a more interpretable way in the future.

Extractive summarisation

Emergent communication

Deep reinforcement learning

Natural language processing

Word Representations

Deep learning

Machine learning

Artificial neural networks

Room EB in the EDIT building, Hörsalsvägen 11
Opponent: Prof. Anders Søgaard, Department of Computer Science, University of Copenhagen, Denmark


Mikael Kågebäck

Chalmers, Computer Science and Engineering (Chalmers), Data Science

Kågebäck, M., Dubhashi, D., Sayeed, A. A reinforcement-learning approach to efficient communication

DeepColor: Reinforcement Learning optimizes information efficiency and well-formedness in color name partitioning

Proceedings of the 40th Annual Meeting of the Cognitive Science Society (CogSci), (2018), p. 1895-1900

Paper in proceeding

Jorge, E., Kågebäck, M., Johansson, F. D., Gustavsson, E. Learning to Play Guess Who? and Inventing a Grounded Language as a Consequence

Extractive Summarization using Continuous Vector Space Models

Proceedings of the 2nd Workshop on Continuous Vector Space Models and their Compositionality (CVSC), EACL, April 26-30, 2014, Gothenburg, Sweden, (2014), p. 31-39

Paper in proceeding

Extractive summarization by aggregating multiple similarities

International Conference Recent Advances in Natural Language Processing, RANLP, Vol. 2015 (2015), p. 451-457

Paper in proceeding

Word Sense Disambiguation using a Bidirectional LSTM

5th Workshop on Cognitive Aspects of the Lexicon (CogALex-V) at the 26th International Conference on Computational Linguistics (COLING 2016), (2016)

Paper in proceeding

Neural context embeddings for automatic discovery of word senses

Proceedings of the 1st Workshop on Vector Space Modeling for Natural Language Processing, Denver, United States, (2015), p. 25-32

Paper in proceeding

Kågebäck, M., Mogren, O. Disentangled activations in deep networks

Speakers of a language tend to have a very personal relationship to the words that make up that language. Every word has a different feel to it that somehow encapsulates the essence of what that word means. But how do you translate this feeling into a mathematical representation that can be used in computation?

In this thesis, learned representations for words are studied in three different contexts. The first is the domain of emergent artificial languages, where agents learn to communicate by inventing their own language. The second is encoding the input to different natural language processing systems, e.g. automatic text summarisation, where learned representations show great potential. Finally, the interpretability of learned representations is addressed in an effort to gain improved control over the information they encode.

Towards a knowledge-based culturomics

Swedish Research Council (VR) (2012-5738), 2012-01-01 -- 2016-12-31.

Subject Categories

Language Technology (Computational Linguistics)

Computer Science

Computer Vision and Robotics (Autonomous Systems)



Doktorsavhandlingar vid Chalmers tekniska högskola. Ny serie: 4512


