Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads

Zahaf, H. -E.; Sanudo Olmedo, I. S.; Singh, J.; Capodieci, N.; Faucou, S.

doi:10.1145/3453417.3453439

To satisfy timing constraints, modern real-time applications require massively parallel accelerators such as General-Purpose Graphics Processing Units (GPGPUs). Generation after generation, the number of computing clusters available in new GPU architectures steadily increases, so investigating suitable scheduling approaches is now mandatory. Such approaches involve mapping different, concurrent compute kernels onto the GPU computing clusters, and hence grouping those clusters into schedulable partitions. In this paper we propose novel techniques to define GPU partitions; this allows us to define suitable task-to-partition allocation mechanisms in which tasks are GPU compute kernels with different timing requirements. These mechanisms take into account the interference that GPU kernels experience when running in overlapping time windows. Hence, an effective and simple way to quantify the magnitude of such interference is also presented. We demonstrate the efficiency of the proposed approaches against the classical techniques that consider the GPU as a single, non-partitionable resource.
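
The abstract does not describe the allocation mechanism in detail, so the following is only an illustrative sketch of what a contention-aware task-to-partition allocation could look like, not the authors' method. Everything in it is an assumption made for illustration: the names `Kernel`, `Partition`, and `allocate`, the first-fit-decreasing heuristic, the per-partition utilization bound of 1.0 used as a stand-in for a real schedulability test, and the linear interference model in which each additional partition inflates execution times by a fixed fraction.

```python
from dataclasses import dataclass, field


@dataclass
class Kernel:
    name: str
    wcet_ms: float    # assumed: worst-case execution time in isolation on one partition
    period_ms: float  # assumed: period, taken equal to the relative deadline


@dataclass
class Partition:
    name: str
    kernels: list = field(default_factory=list)


def utilization(kernels, inflation):
    """Total utilization of a kernel set, with execution times scaled by `inflation`."""
    return sum(inflation * k.wcet_ms / k.period_ms for k in kernels)


def allocate(kernels, n_partitions, inflation_per_neighbor=0.1):
    """First-fit-decreasing allocation of kernels to partitions (hypothetical sketch).

    Interference is modeled pessimistically: every other partition is assumed
    to be active at all times, and each one inflates execution times by
    `inflation_per_neighbor`. Returns the list of partitions, or None if some
    kernel does not fit anywhere.
    """
    partitions = [Partition(f"P{i}") for i in range(n_partitions)]
    inflation = 1.0 + inflation_per_neighbor * (n_partitions - 1)
    # Place the most demanding kernels first.
    ordered = sorted(kernels, key=lambda k: k.wcet_ms / k.period_ms, reverse=True)
    for k in ordered:
        for p in partitions:
            # Simplified acceptance test: inflated utilization must not exceed 1.0.
            if utilization(p.kernels + [k], inflation) <= 1.0:
                p.kernels.append(k)
                break
        else:
            return None  # no partition can host this kernel
    return partitions


if __name__ == "__main__":
    workload = [
        Kernel("a", 4, 10),
        Kernel("b", 3, 10),
        Kernel("c", 5, 20),
        Kernel("d", 2, 10),
    ]
    result = allocate(workload, n_partitions=2)
    if result is None:
        print("no feasible allocation under the assumed model")
    else:
        for p in result:
            print(p.name, [k.name for k in p.kernels])
```

The sketch keeps the interference term as a single multiplicative factor so that its effect on the acceptance test is easy to see; a realistic analysis would measure interference per kernel pair and account for how execution time changes with partition size.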

Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads / Zahaf, H. -E.; Sanudo Olmedo, I. S.; Singh, J.; Capodieci, N.; Faucou, S. - (2021), pp. 226-236. (29th International Conference on Real-Time Networks and Systems, RTNS 2021, fra, 2021) [10.1145/3453417.3453439].



Year of publication: 2021
Conference title: 29th International Conference on Real-Time Networks and Systems, RTNS 2021
Conference location: fra
Conference date: 2021
DOI: https://dx.doi.org/10.1145/3453417.3453439
WoS ID: WOS:000933139900022
Scopus ID: 2-s2.0-85111976043
First page: 226
Last page: 236
All authors: Zahaf, H. -E.; Sanudo Olmedo, I. S.; Singh, J.; Capodieci, N.; Faucou, S.
Citation: Contention-Aware GPU Partitioning and Task-to-Partition Allocation for Real-Time Workloads / Zahaf, H. -E.; Sanudo Olmedo, I. S.; Singh, J.; Capodieci, N.; Faucou, S. - (2021), pp. 226-236. (29th International Conference on Real-Time Networks and Systems, RTNS 2021, fra, 2021) [10.1145/3453417.3453439].
Type: Conference proceedings paper
