HePREM: A Predictable Execution Model for GPU-based Heterogeneous SoCs / Forsberg, B.; Benini, L.; Marongiu, A. - In: IEEE TRANSACTIONS ON COMPUTERS. - ISSN 0018-9340. - 70:1(2021), pp. 17-29. [10.1109/TC.2020.2980520]

HePREM: A Predictable Execution Model for GPU-based Heterogeneous SoCs

Marongiu A.
2021

Abstract

The ever-increasing need for computational power in embedded devices has led to the adoption of heterogeneous SoCs combining a general-purpose CPU with a data-parallel accelerator. These systems rely on a shared main memory (DRAM), which makes them highly susceptible to memory interference. A promising software technique to counter such effects is the Predictable Execution Model (PREM). PREM ensures robustness to interference by separating programs into a sequence of memory and compute phases, and by enforcing a platform-level schedule in which only a single processing subsystem is permitted to execute a memory phase at a time. This article demonstrates for the first time how PREM can be applied to heterogeneous SoCs, based on a synchronization technique for memory isolation between CPU and GPU plus a compiler that transforms GPU kernels into PREM-compliant code. For compute-bound GPU workloads sharing the DRAM bandwidth 50/50 with the CPU, we guarantee near-zero timing variability at a performance loss of just 59 percent, which is one to two orders of magnitude smaller than the worst case we see for unmodified programs under memory interference.
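The scheduling rule described in the abstract (programs split into memory and compute phases, with at most one subsystem in a memory phase at any time) can be illustrated with a toy model. The sketch below is not the synchronization technique or the compiler from the article; it is a minimal simulation under stated assumptions: phase durations are known constants, each task alternates one memory phase and one compute phase, and a greedy arbiter grants memory access to whichever task is ready first. The names `Task`, `prem_schedule`, and `memory_phases_overlap` are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class Task:
    """A subsystem's workload as (memory_time, compute_time) phase pairs."""
    name: str
    phases: list


def prem_schedule(tasks):
    """Greedy toy schedule: memory phases are mutually exclusive,
    compute phases of different tasks may run in parallel.

    Returns {task name: [(mem_start, mem_end, compute_end), ...]}.
    """
    by_name = {t.name: t for t in tasks}
    ready = {t.name: 0.0 for t in tasks}    # earliest start of next memory phase
    next_idx = {t.name: 0 for t in tasks}   # index of the next phase pair
    timeline = {t.name: [] for t in tasks}
    memory_free = 0.0                       # time at which DRAM access is free again
    pending = [t.name for t in tasks if t.phases]

    while pending:
        # Grant memory to the task that is ready first (ties: list order).
        name = min(pending, key=lambda n: ready[n])
        mem_time, compute_time = by_name[name].phases[next_idx[name]]

        mem_start = max(ready[name], memory_free)  # wait for exclusive access
        mem_end = mem_start + mem_time
        compute_end = mem_end + compute_time       # compute needs no DRAM here

        timeline[name].append((mem_start, mem_end, compute_end))
        memory_free = mem_end
        ready[name] = compute_end
        next_idx[name] += 1
        if next_idx[name] == len(by_name[name].phases):
            pending.remove(name)

    return timeline


def memory_phases_overlap(timeline):
    """Return True if any two memory phases in the timeline overlap."""
    intervals = sorted(
        (start, end) for phases in timeline.values() for start, end, _ in phases
    )
    return any(
        prev_end > next_start
        for (_, prev_end), (next_start, _) in zip(intervals, intervals[1:])
    )


if __name__ == "__main__":
    cpu = Task("CPU", [(1, 2), (1, 2)])
    gpu = Task("GPU", [(2, 3), (2, 3)])
    for name, phases in prem_schedule([cpu, gpu]).items():
        print(name, phases)
```

In this model a task's completion time depends only on the fixed phase durations and the arbitration order, not on what the other subsystem does during its compute phases, which is the intuition behind PREM's robustness to interference. The durations used in the demo are arbitrary and are not taken from the article.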
Year: 2021
Volume: 70
Issue: 1
First page: 17
Last page: 29
Authors: Forsberg, B.; Benini, L.; Marongiu, A.
Files in this item:

File: HePREM_A_Predictable_Execution_Model_for_GPU-based_Heterogeneous_SoCs.pdf
Access: Restricted
Type: Publisher's published version
Size: 1.14 MB
Format: Adobe PDF

File: tc.2020.2980520.pdf
Access: Open access
Type: Author's version, revised and accepted for publication
Size: 1.8 MB
Format: Adobe PDF
Creative Commons License
Metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1236916
Citations
  • PMC: not available
  • Scopus: 8
  • ISI (Web of Science): 8