Many-core heterogeneous designs are nowadays widely available among embedded systems. Initiatives such as the HSA push for a model where the host processor and the accelerator(s) communicate via coherent, Unified Virtual Memory (UVM). In this paper we describe our experience in porting the OpenMP v4 programming model to a low-end, heterogeneous embedded system based on the PULP many-core accelerator featuring lightweight (software-managed) UVM support. We describe a GCC-based toolchain which enables: i) the automatic generation of host and accelerator binaries from a single, high-level, OpenMP parallel program; ii) the automatic instrumentation of the accelerator program to transparently manage UVM. This enables up to 4× faster execution compared to traditional copy-based offload mechanisms.
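To make the programming model concrete, the following is a minimal sketch of a single-source OpenMP v4 offload region in C. It is not taken from the paper or its toolchain: the function name vec_add_offload and the vector-add kernel are hypothetical, and only standard OpenMP 4.0 directives are used. In a copy-based scheme the map clauses imply transfers between host and accelerator memory; on a system with shared virtual memory of the kind the abstract describes, a toolchain could in principle pass the host pointers instead, which is what "zero-copy" refers to.

```c
/* Hypothetical kernel: element-wise vector addition offloaded with
 * standard OpenMP 4.0 directives. Names are illustrative only. */
void vec_add_offload(const int *a, const int *b, int *c, int n)
{
    /* 'target' marks the region to run on the accelerator.
     * map(to: ...) describes inputs, map(from: ...) describes outputs.
     * A copy-based runtime turns these into memory transfers. */
    #pragma omp target map(to: a[0:n], b[0:n]) map(from: c[0:n])
    #pragma omp parallel for
    for (int i = 0; i < n; i++) {
        c[i] = a[i] + b[i];
    }
}
```

Compiled without OpenMP support, the pragmas are ignored and the loop runs sequentially on the host, so the same source remains a valid C function either way.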

Enabling zero-copy OpenMP offloading on the PULP many-core accelerator / Capotondi, Alessandro; Marongiu, Andrea. - Electronic. - (2017), pp. 68-71. (Paper presented at the 20th International Workshop on Software and Compilers for Embedded Systems, SCOPES 2017, held at Schloss Rheinfels, Germany, in 2017) [10.1145/3078659.3079071].

Enabling zero-copy OpenMP offloading on the PULP many-core accelerator

Capotondi, Alessandro; Marongiu, Andrea
2017


Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1171890