glemöhnic
Other conference contribution, 2024

glemöhnic is a roughly 20-minute performance piece that uses SpeechBrain, OpenAI and CoquiTTS voice and speech models to transcribe, synthesise and clone historical and real-time improvised vocal gestures.

This piece explores how extra-normal vocal sounds provoke nonsensical AI mediations of human vocality when non-text audio is fed to text-expectant AI speech recognition and synthesis models. Automatic speech recognition (ASR) models mediate these non-textual vocal gestures into eclectic, bizarre and poetic nonsense, which then serves as textual input for text-to-speech synthesis and voice cloning models. CoquiTTS’ XTTS_V2 model reconstructs the syllabic, phonemic and garbled poems as vocal clones that oscillate between their reference audio (the original audio dataset input) and the scraped audio data on which XTTS_V2 was trained. The resulting collection of original and cloned audio samples serves as sonic material in a live-coded musical performance on the strudelREPL platform.
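The chain described above (non-textual audio into ASR, ASR text into voice cloning, clone conditioned on the original audio) can be sketched roughly as follows. This is an illustrative sketch only, not the artist's actual code: `Sample`, `mediate`, the two callables and the `clones` output directory are hypothetical names, and skipping empty transcripts is an assumption made here for tidiness.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins for the real models: any ASR wrapper that maps an
# audio path to text, and any voice-cloning TTS wrapper that maps
# (text, reference audio path, output path) to the written output path.
Transcriber = Callable[[str], str]
Cloner = Callable[[str, str, str], str]


@dataclass
class Sample:
    source_audio: str  # original non-textual vocal gesture
    transcript: str    # text the ASR model produced from it
    cloned_audio: str  # synthesised clone conditioned on the source audio


def mediate(audio_paths: List[str], transcribe: Transcriber,
            clone: Cloner, out_dir: str = "clones") -> List[Sample]:
    """Run each recording through ASR, then re-voice the resulting text
    using the same recording as the cloning reference."""
    samples = []
    for i, path in enumerate(audio_paths):
        text = transcribe(path)
        if not text.strip():
            # Assumption: recordings with an empty transcript are skipped.
            continue
        out_path = f"{out_dir}/clone_{i:03d}.wav"
        samples.append(Sample(path, text, clone(text, path, out_path)))
    return samples
```

In practice the `clone` callable might wrap something like Coqui TTS's `tts_to_file(text=..., speaker_wav=..., language=..., file_path=...)` on an XTTS v2 model, and `transcribe` an ASR model of choice. The record does not specify how the piece wires these together, so both are left as pluggable functions here.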

research-through-design

AI

voice synthesis

musical AI

speech recognition

Author

Kelsey Cotton

Chalmers, Computer Science and Engineering, Data Science and AI

International Conference on AI and Musical Creativity
Oxford, United Kingdom

Subject categories (SSIF 2011)

Performing Arts

Music

Computer Science

Infrastructure

C3SE (Chalmers Centre for Computational Science and Engineering)

More information

Last updated

2024-10-30