State-of-the-art Computer Vision pipelines perform poorly on artworks and other data from the artistic domain, which limits the applicability of current architectures to the automatic understanding of cultural heritage. This is mainly due to the difference in texture and low-level feature distribution between artistic images and the real images on which state-of-the-art approaches are usually trained. To enhance the applicability of pre-trained architectures to artistic data, we have recently proposed an unpaired domain translation approach which translates artworks to photo-realistic visualizations. Our approach leverages semantically-aware memory banks of real patches, which drive the generation of the translated image while improving its realism. In this paper, we provide additional analyses and experimental results which demonstrate the effectiveness of our approach. In particular, we use automatic distance metrics to evaluate the quality of the generated results when translating landscapes, portraits, and paintings from four different styles. We also analyze the response of pre-trained architectures for classification, detection and segmentation, in terms of both feature distribution and prediction entropy, and show that our approach effectively reduces the domain shift of paintings. As an additional contribution, we provide a qualitative analysis of the reduction of the domain shift for detection, segmentation and image captioning.

Image-to-Image Translation to Unfold the Reality of Artworks: an Empirical Analysis / Tomei, Matteo; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita. - (2019), pp. 741-752. (Paper presented at the International Conference on Image Analysis and Processing, held in Trento, Italy, 9-13 September 2019) [10.1007/978-3-030-30645-8_67].

Image-to-Image Translation to Unfold the Reality of Artworks: an Empirical Analysis

Tomei, Matteo; Cornia, Marcella; Baraldi, Lorenzo; Cucchiara, Rita
2019

Conference: International Conference on Image Analysis and Processing
Location: Trento, Italy
Dates: 9-13 September, 2019
Pages: 741-752
Files in this item:
File: paper.pdf
Access: Restricted
Type: Author's revised version accepted for publication
Size: 3.14 MB
Format: Adobe PDF

Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1178737
Citations
  • PMC n/a
  • Scopus 3
  • ISI 3