glemöhnic
Other conference contribution, 2024
This piece explores how extra-normal vocal sounds provoke nonsensical AI mediations of human vocality, by using non-text audio as input material for text-expectant AI speech recognition and synthesis models. The mediation of non-textual human voice gestures by these ASR models yields eclectic, bizarre and poetic nonsense, which is then used as textual input for text-to-speech synthesis and voice cloning models. Coqui TTS's XTTS_V2 model reconstructs the syllabic, phonemic and garbled poems into vocal clones that oscillate between their reference audio (the original audio input) and the scraped audio data that the XTTS_V2 model has been trained on. The result is a collection of original and cloned audio samples that are used as sonic material in a live-coded musical performance on the strudel REPL platform.
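A minimal sketch of the two-stage pipeline described above, in Python. The abstract does not name the ASR system used, so OpenAI's Whisper stands in here as one example of a "text-expectant" recognition model; file paths are placeholders, and the live-coded strudel stage is not shown.

import whisper
from TTS.api import TTS

# Stage 1 (assumed ASR: Whisper): force a text-expectant speech recognition
# model to transcribe non-textual, extra-normal vocal gestures into nonsense text.
asr = whisper.load_model("base")
garbled_poem = asr.transcribe("extra_normal_voice.wav")["text"]  # placeholder path

# Stage 2: feed the resulting nonsense text into XTTS_V2, using the original
# recording as the voice-cloning reference audio.
xtts = TTS("tts_models/multilingual/multi-dataset/xtts_v2")
xtts.tts_to_file(
    text=garbled_poem,
    speaker_wav="extra_normal_voice.wav",
    language="en",
    file_path="cloned_output.wav",
)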
research-through-design
AI
voice synthesis
musical AI
speech recognition
Author
Kelsey Cotton
Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI
International Conference on AI and Musical Creativity
Oxford, United Kingdom
Subject Categories (SSIF 2011)
Performing Arts
Music
Computer Science
Infrastructure
C3SE (Chalmers Centre for Computational Science and Engineering)