Musical AI Voices: Facts, Concerns and Experimental Musical Practices with AI Voice Tools
Licentiate thesis, 2024
Within this topic, we address the technological facts and the societal implications and concerns of generative AI voice tools--encompassing deep learning voice models and speech toolkits--which offer unique artistic potentials to work with a feasibly unending palette of generated vocal sounds. The fidelity of these tools is continually advancing, and they are increasingly being utilised within artistic practices.
Our motivations in this thesis are grounded in an exploration of AI voice tools' potentials and problematics, which span a range of both pragmatic technology facts and societal concerns, and navigate interdisciplinary boundaries. The nature and pace of deep learning developments are such that we presently lack methods of visibilising and critiquing the potentials and problematics of AI voice tools used in musical contexts. Further, we are in an unfolding period of investigating the wider socio-technical implications that are constructed through these potentials and problematics within musical practice. This thesis therefore explores the following research questions:
1) What methodologies assist in visibilising the multifaceted potentials and/or problematics of AI voice tools used in musical contexts? 2) What wider socio-technical implications occur through these potentials and problematics within musical practice? 3) What shifts occur within an experimental musical practice when critically exploring AI voice and speech tools?
Seeking to answer these questions, this thesis develops methodologies for--and chronicles--interdisciplinary practical and theoretical engagements with AI voice and speech models. Further, it discusses and formulates practical methods for feminist and interventionist analysis, and the development of--and performance with--AI voice and speech tools in experimental musical settings. Questions on how to visibilise the potentials and problematics of AI voice tools in musical contexts are foregrounded, alongside explorations into the shifts that occur within experimental musical practices when engaging with such tools.
This thesis contributes: 1) a novel analytical method for the critical analysis of artworks featuring musical AI voice tools; 2) the establishment of interdisciplinary perspectives as integral to understanding the use, cultures-of-use and implications of voice and speech AI tools in musical applications; 3) a Research-through-Design account of developing and performing with a series of AI voice models in a live music performance; and 4) a research stance on experimental musical practices as enabling the formation of new understandings of human and AI-mediated human vocality.
musical AI
musical AI performance
voice
AI vocality
Author
Kelsey Cotton
Chalmers, Computer Science and Engineering (Chalmers), Data Science and AI
A Shift in Artistic Practices through Artificial Intelligence
Leonardo, Vol. 57 (2024), p. 293-297
Journal article
Caring Trouble and Musical AI: Considerations towards a Feminist Musical AI
AI Music Creativity Proceedings 2023 (2023)
Paper in proceeding
Sounding out extra-normal AI voice: Non-normative musical engagements with normative AI voice and speech technologies
AI Music Creativity Proceedings 2024 (2024)
Paper in proceeding
Singing for the Missing: Bringing the Body Back to AI Voice and Speech Technologies
9th International Conference on Movement and Computing (2024)
Paper in proceeding
glemöhnic
International Conference on AI and Musical Creativity (2024)
Other conference contribution
Subject Categories
Media and Communication Technology
Arts
Human Computer Interaction
Computer Science
Infrastructure
C3SE (Chalmers Centre for Computational Science and Engineering)
Publisher
Chalmers
E2 Room 3364 EDIT-rummet
Opponent: Professor Alexander Refsum Jensenius, RITMO Centre for Interdisciplinary Studies in Rhythm, Time and Motion (IMV), University of Oslo, Norway
Related datasets
Supplementary Materials for Kelsey Cotton Licentiate Thesis, 2024 [dataset]
DOI: https://doi.org/10.5281/zenodo.13990776