How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting

Accurate prediction of future human positions is an essential task for modern video-surveillance systems. Current state-of-the-art models usually rely on a "history" of past tracked locations (e.g., 3 to 5 seconds) to predict a plausible sequence of future locations (e.g., up to the next 5 seconds). We feel that this common schema neglects critical traits of realistic applications: as the collection of input trajectories involves machine perception (i.e., detection and tracking), incorrect detections and fragmentation errors may accumulate in crowded scenes, leading to tracking drifts. On this account, the model would be fed with corrupted and noisy input data, thus fatally affecting its prediction performance. In this regard, we focus on delivering accurate predictions when only a few input observations are used, thus potentially lowering the risks associated with automatic perception. To this end, we conceive a novel distillation strategy that allows a knowledge transfer from a teacher network to a student one, the latter fed with fewer observations (just two). We show that a properly defined teacher supervision allows a student network to perform comparably to state-of-the-art approaches that demand more observations. Besides, extensive experiments on common trajectory forecasting datasets highlight that our student network generalizes better to unseen scenarios.
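The teacher/student setup described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the abstract does not specify the loss, so the code below assumes a generic output-matching distillation objective, and the names `truncate_history`, `mean_l2`, `distillation_loss`, and the weight `alpha` are hypothetical and introduced only for illustration.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def truncate_history(history: Sequence[Point], keep: int = 2) -> List[Point]:
    """Return only the `keep` most recent observed positions (the student's input)."""
    if keep < 1 or keep > len(history):
        raise ValueError("keep must be between 1 and len(history)")
    return list(history[-keep:])


def mean_l2(a: Sequence[Point], b: Sequence[Point]) -> float:
    """Average Euclidean distance between two equal-length trajectories."""
    if len(a) != len(b) or not a:
        raise ValueError("trajectories must be non-empty and of equal length")
    return sum(math.hypot(ax - bx, ay - by) for (ax, ay), (bx, by) in zip(a, b)) / len(a)


def distillation_loss(
    student_pred: Sequence[Point],
    teacher_pred: Sequence[Point],
    ground_truth: Sequence[Point],
    alpha: float = 0.5,  # hypothetical weight of the teacher term (assumption)
) -> float:
    """Mix a task term (student vs. ground truth) with a teacher-matching term."""
    task_term = mean_l2(student_pred, ground_truth)
    teacher_term = mean_l2(student_pred, teacher_pred)
    return (1.0 - alpha) * task_term + alpha * teacher_term


# Toy data: the teacher would see the full history, the student only the last two points.
history = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
student_input = truncate_history(history, keep=2)

ground_truth = [(4.0, 0.0), (5.0, 0.0)]
teacher_pred = [(4.0, 0.0), (5.0, 0.0)]  # stand-in for a teacher network's output
student_pred = [(4.0, 1.0), (5.0, 1.0)]  # stand-in for a student network's output

loss = distillation_loss(student_pred, teacher_pred, ground_truth)
```

In this toy case the student's prediction is one unit away from both the ground truth and the teacher's output at every step, so both terms contribute equally. A real system would replace the stand-in predictions with network outputs and could distill intermediate features rather than final predictions.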

How many Observations are Enough? Knowledge Distillation for Trajectory Forecasting / Monti, A.; Porrello, A.; Calderara, S.; Coscia, P.; Ballan, L.; Cucchiara, R. - 2022-June:(2022), pp. 6543-6552. (Paper presented at the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022, held in New Orleans, USA, on 19/06/2022) [10.1109/CVPR52688.2022.00644].

Monti A.; Porrello A.; Calderara S.; Coscia P.; Ballan L.; Cucchiara R.

2022

Year of publication: 2022
Conference title: 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition, CVPR 2022
Conference location: New Orleans, USA
Conference date: 19/06/2022
DOI: https://dx.doi.org/10.1109/CVPR52688.2022.00644
WoS ID: WOS:000867754206079
Scopus ID: 2-s2.0-85136124541
Series: PROCEEDINGS IEEE COMPUTER SOCIETY CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION
Volume: 2022-June
First page: 6543
Last page: 6552
All authors: Monti, A.; Porrello, A.; Calderara, S.; Coscia, P.; Ballan, L.; Cucchiara, R.
Type: Paper in conference proceedings

Files in this record:

File	Size	Format
Monti_How_Many_Observations_Are_Enough_Knowledge_Distillation_for_Trajectory_Forecasting_CVPR_2022_paper.pdf (Open access; Type: VOR - Version published by the publisher)	1.46 MB	Adobe PDF

The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact IRIS Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1317187
