Denoising Diffusion Models on Model-Based Latent Space

Scribano, C.; Pezzi, D.; Franchini, G.; Prato, M.

doi:10.3390/a16110501

With the recent advancements in the field of diffusion generative models, it has been shown that defining the generative process in the latent space of a powerful pretrained autoencoder can offer substantial advantages. This approach, by abstracting away imperceptible image details and introducing substantial spatial compression, renders the learning of the generative process more manageable while significantly reducing computational and memory demands. In this work, we propose to replace autoencoder coding with a model-based coding scheme based on traditional lossy image compression techniques; this choice not only further diminishes computational expenses but also allows us to probe the boundaries of latent-space image generation. Our objectives culminate in the proposal of a valuable approximation for training continuous diffusion models within a discrete space, accompanied by enhancements to the generative model for categorical values. Beyond the good results obtained for the problem at hand, we believe that the proposed work holds promise for enhancing the adaptability of generative diffusion models across diverse data types beyond the realm of imagery.
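As a rough illustration of what a "model-based" latent space could look like, the sketch below encodes a grayscale image with a JPEG-style block DCT followed by uniform quantization, then decodes it again. The abstract does not name the codec, so the choice of DCT, the 8x8 block size, the quantization step, and the names `dct_matrix`, `encode`, and `decode` are all assumptions made for illustration, not the authors' method. The integer array returned by `encode` stands in for the discrete latent on which a generative model would be trained.

```python
import numpy as np


def dct_matrix(n: int = 8) -> np.ndarray:
    """Orthonormal DCT-II basis as an (n, n) matrix (rows are frequencies)."""
    k = np.arange(n)[:, None]  # frequency index
    i = np.arange(n)[None, :]  # sample index
    m = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    m[0, :] = np.sqrt(1.0 / n)  # DC row uses a different normalisation
    return m


def encode(image: np.ndarray, block: int = 8, step: float = 16.0) -> np.ndarray:
    """Map an (H, W) image to integer latents of shape (H/b, W/b, b, b).

    Assumes H and W are multiples of `block`. No learned parameters are
    involved: the transform is fixed, which is what makes it "model-based".
    """
    h, w = image.shape
    d = dct_matrix(block)
    # Split the image into non-overlapping blocks.
    blocks = image.reshape(h // block, block, w // block, block)
    blocks = blocks.transpose(0, 2, 1, 3).astype(np.float64)
    coeffs = d @ blocks @ d.T  # 2-D DCT applied to every block
    return np.round(coeffs / step).astype(np.int32)  # lossy quantization


def decode(latent: np.ndarray, block: int = 8, step: float = 16.0) -> np.ndarray:
    """Invert `encode` up to the quantization error."""
    d = dct_matrix(block)
    blocks = d.T @ (latent.astype(np.float64) * step) @ d  # inverse 2-D DCT
    hb, wb = latent.shape[:2]
    return blocks.transpose(0, 2, 1, 3).reshape(hb * block, wb * block)


rng = np.random.default_rng(0)
image = rng.uniform(0.0, 255.0, size=(32, 32))
latent = encode(image)
recon = decode(latent)
rmse = float(np.sqrt(np.mean((recon - image) ** 2)))
```

Because the transform is orthonormal, the reconstruction RMSE cannot exceed half the quantization step, so `step` directly trades latent precision against fidelity. A coarser step yields fewer distinct integer values per coefficient, which is the kind of discrete space the abstract refers to.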

Denoising Diffusion Models on Model-Based Latent Space / Scribano, C.; Pezzi, D.; Franchini, G.; Prato, M. - In: ALGORITHMS. - ISSN 1999-4893. - 16:11(2023), pp. 1-17. [10.3390/a16110501]

Publication year: 2023
Journal: ALGORITHMS
Volume: 16
Issue: 11
First page: 1
Last page: 17
DOI: https://dx.doi.org/10.3390/a16110501
WoS ID: WOS:001107916200001
Scopus ID: 2-s2.0-85178299211
Type: Journal article

Files in this record:

File	Size	Format
algorithms-16-00501-v2.pdf (open access; type: VOR, version published by the publisher; license: Creative Commons)	25.2 MB	Adobe PDF

The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.