VATr++: Choose Your Words Wisely for Handwritten Text Generation

Vanherle, B.; Pippi, V.; Cascianelli, S.; Michiels, N.; Van Reeth, F.; Cucchiara, R.

doi:10.1109/TPAMI.2024.3481154

Styled Handwritten Text Generation (HTG) has received significant attention in recent years, propelled by the success of learning-based solutions employing GANs, Transformers, and, preliminarily, Diffusion Models. Despite this surge in interest, there remains a critical yet understudied aspect - the impact of the input, both visual and textual, on the HTG model training and its subsequent influence on performance. This work extends the VATr [1] Styled-HTG approach by addressing the pre-processing and training issues that it faces, which are common to many HTG models. In particular, we propose generally applicable strategies for input preparation and training regularization that allow the model to achieve better performance and generalization capabilities. Moreover, in this work, we go beyond performance optimization and address a significant hurdle in HTG research - the lack of a standardized evaluation protocol. In particular, we propose a standardization of the evaluation protocol for HTG and conduct a comprehensive benchmarking of existing approaches. By doing so, we aim to establish a foundation for fair and meaningful comparisons between HTG strategies, fostering progress in the field.

VATr++: Choose Your Words Wisely for Handwritten Text Generation / Vanherle, B.; Pippi, V.; Cascianelli, S.; Michiels, N.; Van Reeth, F.; Cucchiara, R. - In: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. - ISSN 0162-8828. - PP:(2024), pp. 1-15. [10.1109/TPAMI.2024.3481154]

Publication year: 2024
Journal: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Volume: PP
First page: 1
Last page: 15
DOI: https://dx.doi.org/10.1109/TPAMI.2024.3481154
Scopus ID: 2-s2.0-85207734330
PubMed ID: 39405139
Type: Journal article

Files in this item:

There are no files associated with this item.

The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.