Modern designs for embedded many-core systems increasingly include application-specific units to accelerate key computational kernels with orders-of-magnitude higher execution speed and energy efficiency compared to software counterparts. A promising architectural template is based on heterogeneous clusters, where simple RISC cores and specialized HW units (HWPU) communicate in a tightly-coupled manner via L1 shared memory. Efficiently integrating processors and a high number of HW Processing Units (HWPUs) in such an system poses two main challenges, namely, architectural scalability and programmability. In this paper we describe an optimized Data Pump (DP) which connects several accelerators to a restricted set of communication ports, and acts as a virtualization layer for programming, exposing FIFO queues to offload “HW tasks” to them through a set of lightweight APIs. In this work, we aim at optimizing both these mechanisms, for respectively reducing modules area and making programming sequence easier and lighter.
A tightly-coupled hardware controller to improve scalability and programmability of shared-memory heterogeneous clusters / Burgio, Paolo; Danilo, Robin; Marongiu, Andrea; Coussy, Philippe; Benini, Luca. - STAMPA. - (2014), pp. 1-4. (Intervento presentato al convegno 17th Design, Automation and Test in Europe, DATE 2014 tenutosi a Dresden, deu nel 2014) [10.7873/DATE2014.038].
A tightly-coupled hardware controller to improve scalability and programmability of shared-memory heterogeneous clusters
BURGIO, PAOLO;MARONGIU, ANDREA;
2014
Abstract
Modern designs for embedded many-core systems increasingly include application-specific units to accelerate key computational kernels with orders-of-magnitude higher execution speed and energy efficiency compared to software counterparts. A promising architectural template is based on heterogeneous clusters, where simple RISC cores and specialized HW units (HWPU) communicate in a tightly-coupled manner via L1 shared memory. Efficiently integrating processors and a high number of HW Processing Units (HWPUs) in such an system poses two main challenges, namely, architectural scalability and programmability. In this paper we describe an optimized Data Pump (DP) which connects several accelerators to a restricted set of communication ports, and acts as a virtualization layer for programming, exposing FIFO queues to offload “HW tasks” to them through a set of lightweight APIs. In this work, we aim at optimizing both these mechanisms, for respectively reducing modules area and making programming sequence easier and lighter.File | Dimensione | Formato | |
---|---|---|---|
A tightly-coupled hardware controller to improve scalability and programmability of shared-memory heterogeneous clusters.pdf
Accesso riservato
Dimensione
449.93 kB
Formato
Adobe PDF
|
449.93 kB | Adobe PDF | Visualizza/Apri Richiedi una copia |
Pubblicazioni consigliate
I metadati presenti in IRIS UNIMORE sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono rilasciati con licenza Attribuzione 4.0 Internazionale (CC BY 4.0), salvo diversa indicazione.
In caso di violazione di copyright, contattare Supporto Iris