While several approaches to bringing vision and language together are emerging, none of them has yet addressed the digital humanities domain, which is nevertheless a rich source of visual and textual data. To foster research in this direction, we investigate the learning of visual-semantic embeddings for historical document illustrations, devising both supervised and semi-supervised approaches. We exploit the joint visual-semantic embeddings to automatically align illustrations and textual elements, thus providing an automatic annotation of the visual content of a manuscript. Experiments are performed on the Borso d'Este Holy Bible, one of the most sophisticated illuminated manuscripts of the Renaissance, which we manually annotate by aligning every illustration with textual commentaries written by experts. Experimental results quantify the domain shift between ordinary visual-semantic datasets and the proposed one, validate the proposed strategies, and suggest directions for future work along the same lines.

Aligning Text and Document Illustrations: towards Visually Explainable Digital Humanities / Baraldi, Lorenzo; Cornia, Marcella; Grana, Costantino; Cucchiara, Rita. - (2018), pp. 1097-1102. (Paper presented at the International Conference on Pattern Recognition, held in Beijing, China, August 20th-24th, 2018) [10.1109/ICPR.2018.8545064].

Aligning Text and Document Illustrations: towards Visually Explainable Digital Humanities

Baraldi, Lorenzo; Cornia, Marcella; Grana, Costantino; Cucchiara, Rita
2018

Conference: International Conference on Pattern Recognition
Location: Beijing, China
Dates: August 20th-24th, 2018
Pages: 1097-1102
Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1159832