Sounding out extra-normal AI voice: Non-normative musical engagements with normative AI voice and speech technologies
Paper in proceedings, 2024

How do we challenge the norms of AI voice technologies? What would be a non-normative approach to finding novel artistic possibilities of speech synthesis and text-to-speech with Deep Learning? This paper delves into SpeechBrain, OpenAI and CoquiTTS voice and speech models from the perspective of an experimental vocal practitioner. An exploratory Research-through-Design process guided an engagement with pre-trained speech synthesis models to reveal their musical affordances in an experimental vocal practice. We recorded this engagement with voice and speech Deep Learning technologies using auto-ethnography, a relatively recent methodology in Human-Computer Interaction. Our position in this paper actively subverts the normative function of these models, provoking nonsensical AI-mediation of human vocality. Emerging from a sense-making process of poetic AI nonsense, we uncover the generative potential of non-normative usage of normative speech recognition and synthesis models. We contribute insights into the affordances of Research-through-Design to inform artistic processes in working with AI models; how AI-mediations reform understandings of human vocality; and artistic perspectives and practice as knowledge-creation mechanisms for working with technology.

speech recognition

AI research-through-design

musical AI

AI voice

speech synthesis

Authors

Kelsey Cotton

Chalmers, Computer Science and Engineering, Data Science and AI

Kivanc Tatar

Chalmers, Computer Science and Engineering, Interaction Design and Software Engineering

AI Music Creativity Proceedings 2024

The International Conference on AI and Musical Creativity
Oxford

Subject categories

Media Engineering

Arts

Information Systems

Human-Computer Interaction (Interaction Design)

Media and Communication Studies

Computer Science

More information

Created

2024-09-18