Image and video captioning are important tasks in visual data analytics, as they concern the ability to describe visual content in natural language. They underpin query answering systems, improve indexing and search, and allow a natural form of human-machine interaction. Although promising deep learning strategies are becoming popular, the heterogeneity of large image archives leaves this task far from solved. In this paper we explore how visual saliency prediction can support image captioning. Recently, some forms of unsupervised machine attention mechanisms have become widespread, but the role of human attention prediction has never been examined extensively for captioning. We propose a machine attention model driven by saliency prediction to generate captions for images, which can be exploited by many services on the cloud and on multimedia data. Experimental evaluations are conducted on the SALICON dataset, which provides ground truth for both saliency and captioning, and on the large Microsoft COCO dataset, the most widely used for image captioning.
Visual Saliency for Image Captioning in New Multimedia Services / Cornia, Marcella; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita. - (2017), pp. 309-314. (Paper presented at the 2017 IEEE International Conference on Multimedia and Expo Workshops, held in Hong Kong, July 10-14, 2017.)
Publication date: | 2017 |
Title: | Visual Saliency for Image Captioning in New Multimedia Services |
Author(s): | Cornia, Marcella; Baraldi, Lorenzo; Serra, Giuseppe; Cucchiara, Rita |
UNIMORE author(s): | |
Digital Object Identifier (DOI): | http://dx.doi.org/10.1109/ICMEW.2017.8026277 |
Scopus identifier: | 2-s2.0-85031674363 |
ISI identifier: | WOS:000427041800049 |
Conference name: | 2017 IEEE International Conference on Multimedia and Expo Workshops |
Conference location: | Hong Kong |
Conference date: | July 10-14, 2017 |
First page: | 309 |
Last page: | 314 |
Type: | Conference proceedings paper |
Files in this record:
File | Description | Type | |
---|---|---|---|
main.pdf | Author's post-print (post-refereeing draft) | Open Access |

Documents in Iris Unimore are released under a Creative Commons Attribution - NonCommercial - NoDerivatives 3.0 Italy license, unless otherwise indicated.