Tightly-coupled hardware support to dynamic parallelism acceleration in embedded shared memory clusters / Burgio, Paolo; Tagliavini, Giuseppe; Conti, Francesco; Marongiu, Andrea; Benini, Luca. - PRINT. - (2014), pp. 1-6. (Paper presented at the 17th Design, Automation and Test in Europe, DATE 2014, held in Dresden, Germany, from 24 March 2014 through 28 March 2014) [10.7873/DATE2014.169].

Tightly-coupled hardware support to dynamic parallelism acceleration in embedded shared memory clusters

BURGIO, PAOLO; MARONGIU, ANDREA
2014

Abstract

Modern embedded system designs increasingly adopt cluster-based architectures, in which small sets of cores communicate through tightly-coupled shared memory banks and high-performance interconnects. At the same time, the complexity of modern applications requires new programming abstractions to exploit dynamic and/or irregular parallelism on such platforms. Supporting dynamic parallelism in systems that i) are resource-constrained and ii) run applications with small units of work calls for a runtime environment with minimal overhead for scheduling parallel tasks. In this work, we study the major sources of overhead in the implementation of OpenMP dynamic loops, sections and tasks, and propose a hardware implementation of a generic Scheduling Engine (HWSE) that fits the semantics of all three constructs. The HWSE is designed as a block tightly coupled to the processing elements (PEs) within a multi-core cluster, communicating through a shared-memory interface. This allows very fast programming and synchronization with the controlling PEs, which is fundamental to achieving fast dynamic scheduling and, ultimately, to enabling fine-grained parallelism. We demonstrate the effectiveness of our solutions with real applications and synthetic benchmarks, using a cycle-accurate virtual platform.
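For readers unfamiliar with the three OpenMP constructs named in the abstract, the following is a minimal sketch in standard OpenMP C. It only illustrates the software-side constructs whose scheduling overhead the paper studies; it does not show the HWSE or its programming interface, which this record does not describe. The function names, the `work` kernel and the problem size `N` are hypothetical and chosen purely for illustration.

```c
#define N 64

/* Hypothetical unit of work: deliberately tiny, so that in a real runtime
 * the cost of scheduling each unit is comparable to the work itself. */
static int work(int i) { return i * i; }

/* Dynamic loop: chunks of 4 iterations are handed to threads at run time. */
long dynamic_loop_sum(void) {
    long sum = 0;
    #pragma omp parallel for schedule(dynamic, 4) reduction(+:sum)
    for (int i = 0; i < N; i++)
        sum += work(i);
    return sum;
}

/* Sections: each section is an independent block executed by one thread. */
long sections_sum(void) {
    long a = 0, b = 0;
    #pragma omp parallel sections
    {
        #pragma omp section
        for (int i = 0; i < N / 2; i++) a += work(i);
        #pragma omp section
        for (int i = N / 2; i < N; i++) b += work(i);
    }
    return a + b;
}

/* Tasks: a single thread creates one task per element; any thread in the
 * team may execute it. All tasks complete at the barrier that ends the
 * single region, before the results are summed. */
long tasks_sum(void) {
    int out[N];
    #pragma omp parallel
    #pragma omp single
    {
        for (int i = 0; i < N; i++) {
            #pragma omp task firstprivate(i)
            out[i] = work(i);
        }
    }
    long sum = 0;
    for (int i = 0; i < N; i++) sum += out[i];
    return sum;
}
```

With an OpenMP-capable compiler (for example `gcc -fopenmp`), all three functions compute the same sum of squares using a different work-distribution mechanism. Compiled without OpenMP support, the pragmas are ignored and the code runs sequentially with the same result.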
Year: 2014
Conference: 17th Design, Automation and Test in Europe, DATE 2014
Location: Dresden, Germany
Dates: 24 March 2014 through 28 March 2014
First page: 1
Last page: 6
Authors: Burgio, Paolo; Tagliavini, Giuseppe; Conti, Francesco; Marongiu, Andrea; Benini, Luca
Files in this item:
File: Tightly-coupled hardware support to dynamic parallelism acceleration in embedded shared memory clusters.pdf
Access: Restricted
Size: 186.48 kB
Format: Adobe PDF
Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact IRIS Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1171893
Citations
  • PMC: not available
  • Scopus: 3
  • ISI: 0