Face-from-Depth for Head Pose Estimation on Depth Images

Depth cameras make it possible to build reliable solutions for people monitoring and behavior understanding, especially when unstable or poor illumination makes common RGB sensors unusable. We therefore propose a complete framework for estimating head and shoulder pose from depth images only. A head detection and localization module is also included, so that the system works end to end. The core element of the framework is a Convolutional Neural Network, called POSEidon+, that receives three types of images as input and outputs the 3D angles of the pose. Moreover, a Face-from-Depth component based on a Deterministic Conditional GAN model is able to hallucinate a face from the corresponding depth image. We empirically demonstrate that this improves system performance. We test the proposed framework on two public datasets, namely Biwi Kinect Head Pose and ICT-3DHP, and on Pandora, a new challenging dataset mainly inspired by the automotive setup. Experimental results show that our method outperforms several recent state-of-the-art works based on both intensity and depth input data, while running in real time at more than 30 frames per second.

Face-from-Depth for Head Pose Estimation on Depth Images / Borghi, Guido; Fabbri, Matteo; Vezzani, Roberto; Calderara, Simone; Cucchiara, Rita. - In: IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. - ISSN 0162-8828. - 42:3(2020), pp. 596-609. [10.1109/TPAMI.2018.2885472]

Guido Borghi; Matteo Fabbri; Roberto Vezzani; Simone Calderara; Rita Cucchiara

2020

Year of publication	2020
Date of first publication	7 Dec 2018
Journal	IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE
Volume	42
Issue	3
First page	596
Last page	609
DOI	https://dx.doi.org/10.1109/TPAMI.2018.2885472
WoS ID	WOS:000525365300006
Scopus ID	2-s2.0-85058148465
PubMed ID	30530311
All authors	Borghi, Guido; Fabbri, Matteo; Vezzani, Roberto; Calderara, Simone; Cucchiara, Rita
Type	Journal article

Files in this record:

File	Access	Type	Size	Format
pre-printTPAMIVEZZANI.pdf	Open access	AO - Author's original version submitted for publication	3.73 MB	Adobe PDF
08567956.pdf	Restricted access	VOR - Version published by the publisher	5.12 MB	Adobe PDF

The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact Iris Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1167738
