On the effectiveness of OpenMP teams for cluster-based many-core accelerators / Capotondi, Alessandro; Marongiu, Andrea. - ELECTRONIC. - (2016), pp. 667-674. (Paper presented at the 14th International Conference on High Performance Computing and Simulation, HPCS 2016, held in Innsbruck in 2016) [10.1109/HPCSim.2016.7568399].
On the effectiveness of OpenMP teams for cluster-based many-core accelerators
CAPOTONDI, ALESSANDRO; MARONGIU, ANDREA
2016
Abstract
With the introduction of more powerful and massively parallel embedded processors, embedded systems are becoming HPC-capable. Heterogeneous systems-on-chip (SoCs) that couple a general-purpose host processor to a many-core accelerator are becoming more and more widespread, and provide tremendous peak performance per watt, well suited to executing HPC-class programs. The increased computation potential is, however, traded off against ease of programming. Application developers are required to manually outline the code parts suitable for acceleration, parallelize them efficiently over the many available cores, and orchestrate data transfers to/from the accelerator. In addition, since most many-cores are organized as a collection of clusters, featuring fast local communication but slow remote communication (i.e., to another cluster's local memory), the programmer must also take care of properly mapping the parallel computation so as to avoid poor data locality. OpenMP v4.0 introduces new constructs for computation offloading, as well as directives to deploy parallel computation in a cluster-aware manner. In this paper we assess the effectiveness of OpenMP v4.0 at exploiting the massive parallelism available in embedded heterogeneous SoCs, comparing it to standard parallel loops over several computation-intensive applications from the linear algebra and image processing domains.
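To make the comparison concrete, the following is a minimal sketch (not taken from the paper) contrasting the two OpenMP 4.0 idioms the abstract refers to: a standard offloaded parallel loop versus a teams-based, cluster-aware distribution. The SAXPY kernel, array names, and sizes are illustrative assumptions.

```c
/* Hedged sketch (not code from the paper): contrasts a "flat" OpenMP 4.0
 * offloaded parallel loop with the cluster-aware "teams distribute" form.
 * Kernel, sizes, and names are illustrative assumptions. */
#include <stdio.h>
#include <stdlib.h>

#define N 4096

/* Flat offload: a single team of threads executes the whole loop on the
 * accelerator, with no control over which cluster touches which data. */
static void saxpy_flat(float a, const float *x, float *y)
{
    #pragma omp target map(to: x[0:N]) map(tofrom: y[0:N])
    #pragma omp parallel for
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];
}

/* Cluster-aware offload: the iteration space is first distributed across
 * teams (intended to map onto accelerator clusters), then each chunk is
 * parallelized over the threads of its own team. */
static void saxpy_teams(float a, const float *x, float *y)
{
    #pragma omp target map(to: x[0:N]) map(tofrom: y[0:N])
    #pragma omp teams distribute parallel for
    for (int i = 0; i < N; i++)
        y[i] = a * x[i] + y[i];
}

int main(void)
{
    float *x = malloc(N * sizeof *x);
    float *y = malloc(N * sizeof *y);
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    saxpy_flat(3.0f, x, y);   /* y = 3*1 + 2 = 5 */
    saxpy_teams(3.0f, x, y);  /* y = 3*1 + 5 = 8 */

    printf("y[0] = %f\n", y[0]);  /* prints 8.000000 */
    free(x);
    free(y);
    return 0;
}
```

On a cluster-based accelerator, the teams variant lets the runtime pin each chunk of the iteration space to one cluster, so threads mostly access data resident in their cluster's local memory instead of paying the cost of remote accesses.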
File | Type | Access | Size | Format
---|---|---|---|---
capotondi_HPCS16.pdf | Author's accepted manuscript | Open access | 341.2 kB | Adobe PDF
On_the_effectiveness_of_OpenMP_teams_for_cluster-based_many-core_accelerators.pdf | Publisher's version | Restricted access (copy on request) | 508.53 kB | Adobe PDF