Addressing Degeneracies in Latent Interpolation for Diffusion Models
Paper in proceedings, 2025

There is increasing interest in using image-generating diffusion models for deep data augmentation and image morphing. In this context, it is useful to interpolate between latents produced by inverting a set of input images, in order to generate new images representing some mixture of the inputs. We observe that such interpolation can easily lead to degenerate results when the number of inputs is large. We analyze the cause of this effect theoretically and experimentally, and suggest a remedy: a relatively simple normalization scheme that is easy to use whenever interpolation between latents is needed. We measure image quality using FID and CLIP embedding distance and show experimentally that baseline interpolation methods lead to a drop in quality metrics long before the degeneration issue is clearly visible. In contrast, our method significantly reduces the degeneration effect and also improves quality metrics in non-degenerate situations.
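
The abstract does not spell out the normalization scheme, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that the latents behave like samples from a standard Gaussian, in which case a uniform average of N latents has a norm roughly sqrt(N) times smaller than a typical latent, and it rescales the mixture back to the typical norm sqrt(dim). The function names (linear_mix, rescale_to_prior_norm, l2_norm) are hypothetical.

```python
import math
import random


def l2_norm(vec):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(x * x for x in vec))


def linear_mix(latents, weights):
    """Plain weighted sum of latent vectors (the baseline interpolation)."""
    dim = len(latents[0])
    return [sum(w * z[i] for w, z in zip(weights, latents)) for i in range(dim)]


def rescale_to_prior_norm(latent):
    """Rescale a latent so its norm equals sqrt(dim).

    Assumption: latents are approximately N(0, I), whose samples have
    norm close to sqrt(dim) in high dimensions.
    """
    target = math.sqrt(len(latent))
    scale = target / l2_norm(latent)
    return [x * scale for x in latent]


# Synthetic stand-ins for inverted image latents (assumed Gaussian).
rng = random.Random(0)
dim, n_inputs = 4096, 16
latents = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_inputs)]
weights = [1.0 / n_inputs] * n_inputs

mixed = linear_mix(latents, weights)   # norm shrinks as n_inputs grows
fixed = rescale_to_prior_norm(mixed)   # norm restored to sqrt(dim)

print(f"typical norm: {math.sqrt(dim):.1f}")
print(f"mixed norm:   {l2_norm(mixed):.1f}")
print(f"fixed norm:   {l2_norm(fixed):.1f}")
```

Under these assumptions the unnormalized mixture drifts far from the region the model was trained on as the number of inputs grows, which is one plausible reading of the degeneracy the abstract describes; the paper itself should be consulted for the actual analysis and remedy.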

Diffusion models

Image interpolation

Text-to-image models

Authors

Erik Landolsi

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Fredrik Kahl

Chalmers, Electrical Engineering, Signal Processing and Biomedical Engineering

Lecture Notes in Computer Science

0302-9743 (ISSN) 1611-3349 (eISSN)

Vol. 15725 LNCS, pp. 16-29
9783031959103 (ISBN)

23rd Scandinavian Conference on Image Analysis, SCIA 2025
Reykjavik, Iceland

Subject categories (SSIF 2025)

Computer graphics and computer vision

DOI

10.1007/978-3-031-95911-0_2

More information

Last updated

2025-07-16