End-to-end transformer-based trackers have achieved remarkable performance on most human-related datasets. However, training these trackers in heterogeneous scenarios poses significant challenges, including negative interference - where the model learns conflicting scene-specific parameters - and limited domain generalization, which often necessitates expensive fine-tuning to adapt the models to new domains. In response to these challenges, we introduce Parameter-efficient Scenario-specific Tracking Architecture (PASTA), a novel framework that combines Parameter-Efficient Fine-Tuning (PEFT) and Modular Deep Learning (MDL). Specifically, we define key scenario attributes (e.g, camera-viewpoint, lighting condition) and train specialized PEFT modules for each attribute. These expert modules are combined in parameter space, enabling systematic generalization to new domains without increasing inference time. Extensive experiments on MOTSynth, along with zero-shot evaluations on MOT17 and PersonPath22 demonstrate that a neural tracker built from carefully selected modules surpasses its monolithic counterpart. We release models and code.

Is Multiple Object Tracking a Matter of Specialization? / Mancusi, Gianluca; Bernardi, Mattia; Panariello, Aniello; Porrello, Angelo; Cucchiara, Rita; Calderara, Simone. - (2024). (Intervento presentato al convegno The Thirty-Eighth Annual Conference on Neural Information Processing Systems tenutosi a Vancouver nel Tuesday Dec 10 through Sunday Dec 15 2024).

Is Multiple Object Tracking a Matter of Specialization?

Gianluca Mancusi;Mattia Bernardi;Aniello Panariello;Angelo Porrello;Rita Cucchiara;Simone Calderara
2024

Abstract

End-to-end transformer-based trackers have achieved remarkable performance on most human-related datasets. However, training these trackers in heterogeneous scenarios poses significant challenges, including negative interference - where the model learns conflicting scene-specific parameters - and limited domain generalization, which often necessitates expensive fine-tuning to adapt the models to new domains. In response to these challenges, we introduce Parameter-efficient Scenario-specific Tracking Architecture (PASTA), a novel framework that combines Parameter-Efficient Fine-Tuning (PEFT) and Modular Deep Learning (MDL). Specifically, we define key scenario attributes (e.g, camera-viewpoint, lighting condition) and train specialized PEFT modules for each attribute. These expert modules are combined in parameter space, enabling systematic generalization to new domains without increasing inference time. Extensive experiments on MOTSynth, along with zero-shot evaluations on MOT17 and PersonPath22 demonstrate that a neural tracker built from carefully selected modules surpasses its monolithic counterpart. We release models and code.
2024
The Thirty-Eighth Annual Conference on Neural Information Processing Systems
Vancouver
Tuesday Dec 10 through Sunday Dec 15 2024
Mancusi, Gianluca; Bernardi, Mattia; Panariello, Aniello; Porrello, Angelo; Cucchiara, Rita; Calderara, Simone
Is Multiple Object Tracking a Matter of Specialization? / Mancusi, Gianluca; Bernardi, Mattia; Panariello, Aniello; Porrello, Angelo; Cucchiara, Rita; Calderara, Simone. - (2024). (Intervento presentato al convegno The Thirty-Eighth Annual Conference on Neural Information Processing Systems tenutosi a Vancouver nel Tuesday Dec 10 through Sunday Dec 15 2024).
File in questo prodotto:
File Dimensione Formato  
paper_pasta.pdf

Open access

Tipologia: Versione dell'autore revisionata e accettata per la pubblicazione
Dimensione 8.77 MB
Formato Adobe PDF
8.77 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

Licenza Creative Commons
I metadati presenti in IRIS UNIMORE sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono rilasciati con licenza Attribuzione 4.0 Internazionale (CC BY 4.0), salvo diversa indicazione.
In caso di violazione di copyright, contattare Supporto Iris

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11380/1364431
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus ND
  • ???jsp.display-item.citation.isi??? ND
social impact