Latent Timbre Synthesis: Audio-based Variational Auto-Encoders for Music Composition Applications

Kivanc Tatar; Daniel Bisig; Philippe Pasquier

doi:10.1007/s00521-020-05424-2

Journal article, 2020

We present the Latent Timbre Synthesis, a new audio synthesis method using deep learning. The synthesis method allows composers and sound designers to interpolate and extrapolate between the timbre of multiple sounds using the latent space of audio frames. We provide the details of two Variational Autoencoder architectures for the Latent Timbre Synthesis and compare their advantages and drawbacks. The implementation includes a fully working application with a graphical user interface, called interpolate_two, which enables practitioners to generate timbres between two audio excerpts of their selection using interpolation and extrapolation in the latent space of audio frames. Our implementation is open source, and we aim to improve the accessibility of this technology by providing a guide for users with any technical background. Our study includes a qualitative analysis where nine composers evaluated the Latent Timbre Synthesis and the interpolate_two application within their practices.
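
The idea of interpolating and extrapolating between two sounds in a latent space can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a plain linear blend, frame by frame, between two equally long sequences of latent vectors, and the name `blend_latents` and the toy values are hypothetical. In a real system the latent vectors would come from a trained encoder and the blended vectors would be passed to a decoder to produce audio frames; neither step is shown here.

```python
def blend_latents(z_a, z_b, alpha):
    """Blend two sequences of latent vectors frame by frame.

    Assumption: a simple linear blend, (1 - alpha) * z_a + alpha * z_b.
    alpha in [0, 1] interpolates between the two sounds;
    alpha outside that range extrapolates past one of them.
    """
    if len(z_a) != len(z_b):
        raise ValueError("latent sequences must have the same number of frames")
    return [
        [(1.0 - alpha) * a + alpha * b for a, b in zip(frame_a, frame_b)]
        for frame_a, frame_b in zip(z_a, z_b)
    ]


# Toy stand-ins for encoder outputs: 2 frames, 2 latent dimensions each.
z_a = [[0.0, 2.0], [1.0, -1.0]]
z_b = [[4.0, 2.0], [3.0, 1.0]]

midpoint = blend_latents(z_a, z_b, 0.5)  # halfway between the two sounds
beyond_b = blend_latents(z_a, z_b, 1.5)  # extrapolated past the second sound
```

Treating interpolation and extrapolation as the same operation with different values of `alpha` keeps the sketch small; the published method may use a different blending scheme.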

Authors

Kivanc Tatar

Simon Fraser University

Daniel Bisig

Zurich University of the Arts

Philippe Pasquier

Simon Fraser University

Neural Computing and Applications

0941-0643 (ISSN) 1433-3058 (eISSN)

Vol. 33, Special Issue “Networks in Art, Sound and Design”, pp. 67-84

Subject Categories (SSIF 2011)

Media and Communication Technology

Information Systems, Social aspects

Computer Science

Latest update

10/6/2023