In this paper we promote the idea of using pixel-based models not only for low-level vision, but also to extract high-level symbolic representations. We use a deep architecture whose distinctive property is that it relies on computational units that incorporate classic computer vision invariances, especially scale invariance. The proposed learning algorithm, which is based on information-theoretic principles, develops the parameters of the computational units and, at the same time, makes it possible to detect the optimal scale for each pixel. We give experimental evidence of the mechanism of feature extraction at the first level of the hierarchy, which is closely related to SIFT-like features. The comparison clearly shows that, whenever a massive amount of training data is available, the proposed model performs better than SIFT. © 2012 Springer-Verlag.
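
The abstract says the learning algorithm rests on information theory principles but does not state the objective. As a generic illustration only, and not the paper's actual formulation, the sketch below computes the mutual information I(X;Y) = H(Y) - H(Y|X) between a set of inputs (for example pixel neighbourhoods) and the discrete feature symbols they are softly assigned to. The function names, the uniform weighting of inputs, and the toy posteriors are all assumptions made for this example.

```python
import math


def entropy(probs):
    """Shannon entropy, in bits, of a discrete distribution."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def mutual_information(posteriors):
    """I(X;Y) = H(Y) - H(Y|X), with every input weighted equally (assumption).

    posteriors[i][k] is p(y = k | x_i): the soft assignment of input i
    to feature symbol k. Each row must sum to 1.
    """
    n = len(posteriors)
    k = len(posteriors[0])
    # Marginal over symbols: average of the per-input posteriors.
    marginal = [sum(row[j] for row in posteriors) / n for j in range(k)]
    # Conditional entropy: average uncertainty of each input's assignment.
    conditional = sum(entropy(row) for row in posteriors) / n
    return entropy(marginal) - conditional


# Hypothetical toy data: four inputs, two feature symbols.
confident = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0], [0.0, 1.0]]
uninformative = [[0.5, 0.5]] * 4

print(mutual_information(confident))      # 1.0 bit
print(mutual_information(uninformative))  # 0.0 bits
```

The quantity is high when each input is assigned to a symbol with confidence (low H(Y|X)) while the symbols are used evenly across inputs (high H(Y)), which is the usual intuition behind information-theoretic feature learning.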

Information theoretic learning for pixel-based visual agents / Gori, Marco; Melacci, Stefano; Lippi, Marco; Maggini, Marco. - 7577:6(2012), pp. 864-875. (Paper presented at the 12th European Conference on Computer Vision, ECCV 2012, held in Florence, Italy, in 2012) [10.1007/978-3-642-33783-3_62].

Information theoretic learning for pixel-based visual agents

LIPPI, MARCO;
2012

Publication year: 2012
Conference: 12th European Conference on Computer Vision, ECCV 2012
Location: Florence, Italy
Conference year: 2012
Volume: 7577
First page: 864
Last page: 875
Authors: Gori, Marco; Melacci, Stefano; Lippi, Marco; Maggini, Marco


Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1122656