Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation

Compressed Volumetric Heatmaps for Multi-Person 3D Pose Estimation / Fabbri, Matteo; Lanzi, Fabio; Calderara, Simone; Alletto, Stefano; Cucchiara, Rita. - (2020), pp. 7202-7211. (Paper presented at the 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020, held in Seattle, June 16-18, 2020) [10.1109/CVPR42600.2020.00723].

Matteo Fabbri; Fabio Lanzi; Simone Calderara; Stefano Alletto; Rita Cucchiara

2020

Abstract

In this paper we present a novel approach for bottom-up multi-person 3D human pose estimation from monocular RGB images. We propose to use high-resolution volumetric heatmaps to model joint locations, devising a simple and effective compression method to drastically reduce the size of this representation. At the core of the proposed method lies our Volumetric Heatmap Autoencoder, a fully-convolutional network tasked with compressing ground-truth heatmaps into a dense intermediate representation. A second model, the Code Predictor, is then trained to predict these codes, which can be decompressed at test time to recover the original representation. Our experimental evaluation shows that our method performs favorably when compared to the state of the art on both multi-person and single-person 3D human pose estimation datasets and, thanks to our novel compression strategy, can process full-HD images at a constant runtime of 8 fps regardless of the number of subjects in the scene. Code and models are publicly available.
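
To make the representation named in the abstract concrete, the following is a minimal sketch of a volumetric heatmap for a single joint type, written in plain Python. It is not the authors' implementation: the grid size, the Gaussian width `sigma`, the threshold, and the function names `make_heatmap` and `find_peaks` are all assumptions chosen for illustration. Each joint is encoded as a Gaussian blob in a depth x height x width grid, and joint locations are read back as local maxima, so the number of people does not change the size of the volume.

```python
import math
from itertools import product


def make_heatmap(shape, centers, sigma=1.0):
    """Build a volume vol[d][h][w] with one Gaussian blob per joint center.

    Overlapping blobs are merged with max, so every center keeps value 1.0.
    """
    depth, height, width = shape
    vol = [[[0.0] * width for _ in range(height)] for _ in range(depth)]
    for d, h, w in product(range(depth), range(height), range(width)):
        for cd, ch, cw in centers:
            sq_dist = (d - cd) ** 2 + (h - ch) ** 2 + (w - cw) ** 2
            value = math.exp(-sq_dist / (2.0 * sigma ** 2))
            vol[d][h][w] = max(vol[d][h][w], value)
    return vol


def find_peaks(vol, threshold=0.5):
    """Return voxels above threshold that are strictly greater than
    all of their in-bounds neighbours in the 3x3x3 neighbourhood."""
    depth, height, width = len(vol), len(vol[0]), len(vol[0][0])
    offsets = [o for o in product((-1, 0, 1), repeat=3) if o != (0, 0, 0)]
    peaks = []
    for d, h, w in product(range(depth), range(height), range(width)):
        value = vol[d][h][w]
        if value < threshold:
            continue
        is_peak = True
        for od, oh, ow in offsets:
            nd, nh, nw = d + od, h + oh, w + ow
            if 0 <= nd < depth and 0 <= nh < height and 0 <= nw < width:
                if vol[nd][nh][nw] >= value:
                    is_peak = False
                    break
        if is_peak:
            peaks.append((d, h, w))
    return peaks


# Hypothetical example: the same joint type for two people in a tiny grid.
shape = (8, 12, 16)                      # (depth, height, width), illustrative only
joints = [(2, 3, 4), (5, 8, 10)]         # assumed voxel coordinates of the joints
heatmap = make_heatmap(shape, joints)
recovered = find_peaks(heatmap)          # one peak per encoded joint
num_voxels = shape[0] * shape[1] * shape[2]
```

Even in this toy grid the volume holds 1536 values for one joint type, and the count grows with the product of the three dimensions. That growth is the size problem the abstract's compression step is meant to address; how the Volumetric Heatmap Autoencoder and the Code Predictor are built is not described on this page and is not sketched here.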

Year of publication: 2020
Conference title: 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2020
Conference location: Seattle
Conference date: June 16-18, 2020
DOI: https://dx.doi.org/10.1109/CVPR42600.2020.00723
WoS code: WOS:000620679507048
Scopus code: 2-s2.0-85094852283
Series: PROCEEDINGS IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
First page: 7202
Last page: 7211
All authors: Fabbri, Matteo; Lanzi, Fabio; Calderara, Simone; Alletto, Stefano; Cucchiara, Rita
Type: Conference proceedings paper

Files in this item:

File	Description	Type	Access	Size	Format
main.pdf	Main article and Supplementary Material	AAM - Author's version, revised and accepted for publication	Open access	7.88 MB	Adobe PDF

The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact IRIS Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1206226
