Sounding out extra-normal AI voice: Non-normative musical engagements with normative AI voice and speech technologies
Paper in proceedings, 2024

How do we challenge the norms of AI voice technologies? What would a non-normative approach to finding novel artistic possibilities of speech synthesis and text-to-speech with Deep Learning look like? This paper delves into SpeechBrain, OpenAI and CoquiTTS voice and speech models from the perspective of an experimental vocal practitioner. An exploratory Research-through-Design process guided an engagement with pre-trained speech synthesis models to reveal their musical affordances for an experimental vocal practice. We documented this engagement with voice and speech Deep Learning technologies using auto-ethnography, a recent methodology in Human-Computer Interaction. Our position in this paper actively subverts the normative function of these models, provoking nonsensical AI-mediation of human vocality. Emerging from a sense-making process of poetic AI nonsense, we uncover the generative potential of non-normative usage of normative speech recognition and synthesis models. We contribute insights into the affordances of Research-through-Design for informing artistic processes of working with AI models; into how AI-mediations reform understandings of human vocality; and into artistic perspectives and practice as knowledge-creation mechanisms for working with technology.
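As an illustration only, and not taken from the paper, the following Python sketch suggests how pre-trained models from two of the toolkits named in the abstract, Coqui TTS and SpeechBrain, could be chained into a simple synthesis-then-recognition round trip of nonsense vocal material. The specific model identifiers and the nonsense phrase are assumptions made for the example, not details reported by the authors.

    # Illustrative sketch (not from the paper): a minimal "round trip" echoing the
    # kind of AI-mediation the abstract describes, using pre-trained models from
    # Coqui TTS and SpeechBrain. Model names below are assumptions for the example.
    from TTS.api import TTS
    from speechbrain.pretrained import EncoderDecoderASR

    # Synthesize a deliberately nonsensical phrase with a pre-trained TTS model.
    tts = TTS(model_name="tts_models/en/ljspeech/tacotron2-DDC", progress_bar=False)
    tts.tts_to_file(text="gla bru shhh oo-ee mmm", file_path="nonsense.wav")

    # Feed the synthetic audio back through a pre-trained ASR model and observe
    # how the recognizer normalizes (or mangles) the non-normative material.
    asr = EncoderDecoderASR.from_hparams(
        source="speechbrain/asr-crdnn-rnnlm-librispeech",
        savedir="pretrained_models/asr-crdnn-rnnlm-librispeech",
    )
    print(asr.transcribe_file("nonsense.wav"))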

speech recognition

AI research-through-design

musical AI

AI voice

speech synthesis

Authors

Kelsey Cotton

Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI

Kivanc Tatar

Chalmers, Computer Science and Engineering (Chalmers), Interaction Design and Software Engineering

AI Music Creativity Proceedings 2024

The International Conference on AI and Musical Creativity
Oxford

Subject Categories

Media and Communication Technology

Arts

Information Science

Human Computer Interaction

Media and Communications

Computer Science
