Artpedia: A New Visual-Semantic Dataset with Visual and Contextual Sentences in the Artistic Domain

Stefanini, Matteo; Cornia, Marcella; Baraldi, Lorenzo; Corsini, Massimiliano; Cucchiara, Rita

doi:10.1007/978-3-030-30645-8_66

As vision and language techniques are widely applied to realistic images, there is a growing interest in designing visual-semantic models suitable for more complex and challenging scenarios. In this paper, we address the problem of cross-modal retrieval of images and sentences coming from the artistic domain. To this aim, we collect and manually annotate the Artpedia dataset that contains paintings and textual sentences describing both the visual content of the paintings and other contextual information. Thus, the problem is not only to match images and sentences, but also to identify which sentences actually describe the visual content of a given image. To this end, we devise a visual-semantic model that jointly addresses these two challenges by exploiting the latent alignment between visual and textual chunks. Experimental evaluations, obtained by comparing our model to different baselines, demonstrate the effectiveness of our solution and highlight the challenges of the proposed dataset. The Artpedia dataset is publicly available at: http://aimagelab.ing.unimore.it/artpedia.
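The abstract mentions matching images and sentences through a latent alignment between visual and textual chunks. As a rough illustration only, and not the authors' model, the sketch below scores a sentence against a painting by letting each word embedding pick its most similar image-region embedding and averaging those similarities, then ranks candidate sentences by that score. The function names (`alignment_score`, `rank_sentences`), the max-then-mean pooling, and the toy embeddings are all assumptions made for this example.

```python
import numpy as np


def l2_normalize(x, axis=-1, eps=1e-8):
    """Scale vectors to unit length so dot products become cosine similarities."""
    norm = np.linalg.norm(x, axis=axis, keepdims=True)
    return x / np.maximum(norm, eps)


def alignment_score(regions, words):
    """Hypothetical chunk-level alignment score (an assumption, not the paper's).

    regions: array of shape (R, D), one embedding per image region.
    words:   array of shape (W, D), one embedding per word of a sentence.
    Each word is aligned to its most similar region; the score is the mean
    of those best cosine similarities.
    """
    sim = l2_normalize(words) @ l2_normalize(regions).T  # shape (W, R)
    return float(sim.max(axis=1).mean())


def rank_sentences(regions, sentences):
    """Return sentence indices sorted by decreasing score, plus the scores."""
    scores = [alignment_score(regions, s) for s in sentences]
    order = sorted(range(len(sentences)), key=lambda i: -scores[i])
    return order, scores


# Toy data: three orthogonal "region" embeddings and two candidate sentences.
regions = np.eye(3)
sentences = [
    np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]),  # each word matches one region
    np.array([[1.0, 1.0, 1.0]]),                   # one word, diffuse match
]
order, scores = rank_sentences(regions, sentences)
```

In a retrieval setting the same score could be computed for every image-sentence pair and used to rank sentences for a query painting, or paintings for a query sentence.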

Artpedia: A New Visual-Semantic Dataset with Visual and Contextual Sentences in the Artistic Domain / Stefanini, Matteo; Cornia, Marcella; Baraldi, Lorenzo; Corsini, Massimiliano; Cucchiara, Rita. - (2019), pp. 729-740. (Paper presented at the International Conference on Image Analysis and Processing, held in Trento, Italy, 9-13 September 2019) [10.1007/978-3-030-30645-8_66].

Year of publication: 2019
Conference title: International Conference on Image Analysis and Processing
Conference location: Trento, Italy
Conference dates: 9-13 September 2019
DOI: https://dx.doi.org/10.1007/978-3-030-30645-8_66
WoS ID: WOS:000562008400064
Scopus ID: 2-s2.0-85072887457
Series: Lecture Notes in Computer Science
First page: 729
Last page: 740
Type: Conference proceedings paper

Files in this record:

File	Size	Format
paper.pdf (open access; author's revised version accepted for publication)	553.46 kB	Adobe PDF

Metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.