Several many-core designs tackle scalability issues by leveraging tightly-coupled clusters as building blocks, where low-latency, high-bandwidth interconnection between a small-to-medium number of cores and L1 memory achieves high performance/watt. Tight coupling of hardware accelerators into these multicore clusters is a promising approach to further improve performance/area/watt. However, accelerators are often clocked at a lower frequency than processor clusters for energy-efficiency reasons. In this paper, we propose a technique to integrate shared-memory accelerators within the tightly-coupled clusters of the STMicroelectronics STHORM architecture. Our methodology significantly relaxes timing constraints for tightly-coupled accelerators while optimizing data bandwidth. In addition, our technique allows the accelerator to operate at an integer submultiple of the cluster frequency. Experimental results show that the proposed approach recovers up to 84% of the slow-down implied by reduced accelerator speed.
Synthesis-friendly techniques for tightly-coupled integration of hardware accelerators into shared-memory multi-core clusters / Conti, Francesco; Marongiu, Andrea; Benini, Luca. - Print. - (2013), pp. 1-10. (Paper presented at the conference Hardware/Software Codesign and System Synthesis (CODES+ISSS), 2013 International Conference on, held in Montreal, QC, Sept. 29-Oct. 4, 2013) [10.1109/CODES-ISSS.2013.6658992].
Synthesis-friendly techniques for tightly-coupled integration of hardware accelerators into shared-memory multi-core clusters
Andrea Marongiu
2013