Abstract Wikipedia and Vastly Multilingual Natural Language Generation (Keynote Talk)
Other conference contribution, 2021

Abstract Wikipedia is an initiative from the Wikimedia Foundation to generate Wikipedia articles in multiple languages from an abstract (i.e., language-neutral) source. The goal has been set at 20 million articles in over 300 languages, guaranteed to be in sync with up-to-date information and thereby with each other. This is by far the largest Natural Language Generation (NLG) project of all time. Grammatical Framework (GF), with 40 languages and specialized domains such as science, law, and e-commerce, is orders of magnitude smaller. Nevertheless, GF has served as inspiration for Abstract Wikipedia, and pilot projects have started to scale GF up to the task. Research is needed on NLG techniques, language resources, processing algorithms, and interaction with human authors. This talk will outline a possible way to build up Abstract Wikipedia by starting with simple text-robot-like techniques and proceeding to more sophisticated NLG.

Author

Aarne Ranta

Computing Science

University of Gothenburg

Proceedings of the Symposium Logic and Algorithms in Computational Linguistics 2021 (LACompLing2021)

Logic and Algorithms in Computational Linguistics (LACompLing 2021)
Virtual

Subject Categories (SSIF 2025)

Natural Language Processing

Computer Sciences

More information

Latest update

11/27/2025