High-performance embedded platforms are increasingly adopting heterogeneous systems-on-chip (HeSoC) that couple multi-core CPUs with accelerators such as GPU, FPGA, or AI engines. Adopting HeSoCs in the context of real-time workloads is not immediately possible, though, as contention on shared resources like the memory hierarchy—and in particular the main memory (DRAM)—causes unpredictable latency increase. To tackle this problem, both the research community and certification authorities mandate (i) that accesses from parallel threads to the shared system resources (typically, main memory) happen in a mutually exclusive manner by design, or (ii) that per-thread bandwidth regulation is enforced. Such arbitration schemes provide timing guarantees, but make poor use of the memory bandwidth available in a modern HeSoC. Controlled Memory Request Injection (CMRI) is a recently-proposed bandwidth limitation concept that builds on top of a mutually-exclusive schedule but still allows the threads currently not entitled to access memory to use as much of the unused bandwidth as possible without losing the timing guarantee. CMRI has been discussed in the context of a multi-core CPU, but the same principle applies also to a more complex system such as an HeSoC. In this article, we introduce two CMRI schemes suitable for HeSoCs: Voluntary Throttling via code refactoring and Bandwidth Regulation via dynamic throttling. We extensively characterize a proof-of-concept incarnation of both schemes on two HeSoCs: an NVIDIA Tegra TX2 and a Xilinx UltraScale+, highlighting the benefits and the costs of CMRI for synthetic workloads that model worst-case DRAM access. We also test the effectiveness of CMRI with real benchmarks, studying the effect of interference among the host CPU and the accelerators.

Evaluating Controlled Memory Request Injection for Efficient Bandwidth Utilization and Predictable Execution in Heterogeneous SoCs / Brilli, Gianluca; Cavicchioli, Roberto; Solieri, Marco; Valente, Paolo; Marongiu, Andrea. - In: ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS. - ISSN 1558-3465. - 22:1(2022), pp. 1-25. [10.1145/3548773]

Evaluating Controlled Memory Request Injection for Efficient Bandwidth Utilization and Predictable Execution in Heterogeneous SoCs

Brilli Gianluca;Cavicchioli Roberto
;
Solieri Marco;Valente Paolo;Marongiu Andrea
2022

Abstract

High-performance embedded platforms are increasingly adopting heterogeneous systems-on-chip (HeSoC) that couple multi-core CPUs with accelerators such as GPU, FPGA, or AI engines. Adopting HeSoCs in the context of real-time workloads is not immediately possible, though, as contention on shared resources like the memory hierarchy—and in particular the main memory (DRAM)—causes unpredictable latency increase. To tackle this problem, both the research community and certification authorities mandate (i) that accesses from parallel threads to the shared system resources (typically, main memory) happen in a mutually exclusive manner by design, or (ii) that per-thread bandwidth regulation is enforced. Such arbitration schemes provide timing guarantees, but make poor use of the memory bandwidth available in a modern HeSoC. Controlled Memory Request Injection (CMRI) is a recently-proposed bandwidth limitation concept that builds on top of a mutually-exclusive schedule but still allows the threads currently not entitled to access memory to use as much of the unused bandwidth as possible without losing the timing guarantee. CMRI has been discussed in the context of a multi-core CPU, but the same principle applies also to a more complex system such as an HeSoC. In this article, we introduce two CMRI schemes suitable for HeSoCs: Voluntary Throttling via code refactoring and Bandwidth Regulation via dynamic throttling. We extensively characterize a proof-of-concept incarnation of both schemes on two HeSoCs: an NVIDIA Tegra TX2 and a Xilinx UltraScale+, highlighting the benefits and the costs of CMRI for synthetic workloads that model worst-case DRAM access. We also test the effectiveness of CMRI with real benchmarks, studying the effect of interference among the host CPU and the accelerators.
2022
13-dic-2022
22
1
1
25
Evaluating Controlled Memory Request Injection for Efficient Bandwidth Utilization and Predictable Execution in Heterogeneous SoCs / Brilli, Gianluca; Cavicchioli, Roberto; Solieri, Marco; Valente, Paolo; Marongiu, Andrea. - In: ACM TRANSACTIONS ON EMBEDDED COMPUTING SYSTEMS. - ISSN 1558-3465. - 22:1(2022), pp. 1-25. [10.1145/3548773]
Brilli, Gianluca; Cavicchioli, Roberto; Solieri, Marco; Valente, Paolo; Marongiu, Andrea
File in questo prodotto:
File Dimensione Formato  
Controlled_Injection.pdf

Open access

Tipologia: Versione dell'autore revisionata e accettata per la pubblicazione
Dimensione 6.87 MB
Formato Adobe PDF
6.87 MB Adobe PDF Visualizza/Apri
3548773.pdf

Accesso riservato

Tipologia: Versione pubblicata dall'editore
Dimensione 8.28 MB
Formato Adobe PDF
8.28 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
Pubblicazioni consigliate

Licenza Creative Commons
I metadati presenti in IRIS UNIMORE sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono rilasciati con licenza Attribuzione 4.0 Internazionale (CC BY 4.0), salvo diversa indicazione.
In caso di violazione di copyright, contattare Supporto Iris

Utilizza questo identificativo per citare o creare un link a questo documento: https://hdl.handle.net/11380/1293044
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 1
  • ???jsp.display-item.citation.isi??? 0
social impact