The correct estimation of the head pose is a problem of great importance for many applications. For instance, it is an enabling technology in the automotive field for driver attention monitoring. In this paper, we tackle the pose estimation problem with a deep learning network that operates as a regressor. Traditional methods usually rely on visual facial features, such as facial landmarks or the nose tip position. In contrast, we exploit a Convolutional Neural Network (CNN) to perform head pose estimation directly from depth data. We adopt a Siamese architecture and propose a novel loss function to improve the learning of the regression network layer. The system has been tested on two public datasets, Biwi Kinect Head Pose and the ICT-3DHP database. The reported results demonstrate an improvement in accuracy with respect to current state-of-the-art approaches, as well as the real-time capabilities of the overall framework.

From Depth Data to Head Pose Estimation: a Siamese approach / Venturelli, Marco; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita. - 5:(2017), pp. 194-201. (Paper presented at the 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP 2017), held in Porto, Portugal, 27 February - 1 March 2017) [10.5220/0006104501940201].

From Depth Data to Head Pose Estimation: a Siamese approach

Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita
2017-01-01

Conference: 12th International Joint Conference on Computer Vision, Imaging and Computer Graphics Theory and Applications (VISAPP 2017)
Location: Porto, Portugal
Dates: 27 February - 1 March 2017
Volume: 5
Pages: 194-201
Authors: Venturelli, Marco; Borghi, Guido; Vezzani, Roberto; Cucchiara, Rita
Files in this item:
File: VISAPP_2017_63.pdf
Access: Open access
Type: Publisher's published version
Size: 1.5 MB
Format: Adobe PDF

Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1118253
Citations
  • PMC n/a
  • Scopus 13
  • ISI 12