With the recent advancements in the field of diffusion generative models, it has been shown that defining the generative process in the latent space of a powerful pretrained autoencoder can offer substantial advantages. This approach, by abstracting away imperceptible image details and introducing substantial spatial compression, renders the learning of the generative process more manageable while significantly reducing computational and memory demands. In this work, we propose to replace autoencoder coding with a model-based coding scheme based on traditional lossy image compression techniques; this choice not only further diminishes computational expenses but also allows us to probe the boundaries of latent-space image generation. Our objectives culminate in the proposal of a valuable approximation for training continuous diffusion models within a discrete space, accompanied by enhancements to the generative model for categorical values. Beyond the good results obtained for the problem at hand, we believe that the proposed work holds promise for enhancing the adaptability of generative diffusion models across diverse data types beyond the realm of imagery.

Denoising Diffusion Models on Model-Based Latent Space / Scribano, C.; Pezzi, D.; Franchini, G.; Prato, M. - In: ALGORITHMS. - ISSN 1999-4893. - 16:11(2023), pp. 1-17. [10.3390/a16110501]

Denoising Diffusion Models on Model-Based Latent Space

Scribano C.; Pezzi D.; Franchini G.; Prato M.
2023

Files in this item:
File: algorithms-16-00501-v2.pdf
Access: Open access
Type: Publisher's published version
Size: 25.2 MB
Format: Adobe PDF
Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact Iris Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1328629