Efficient Implementation of Genetic Algorithms on GP-GPU with Scheduled Persistent CUDA Threads

Capodieci, Nicola; Burgio, Paolo

doi:10.1109/PAAP.2015.13

In this paper we present a heavily exploration-oriented implementation of genetic algorithms for graphics processing units (GPUs), optimized with a novel mechanism for scheduling GPU-side synchronized jobs that takes inspiration from the concept of persistent threads. Persistent threads allow workloads to be distributed efficiently across the GPU so as to fully exploit NVIDIA's proprietary Compute Unified Device Architecture (CUDA). Our approach, named Scheduled Light Kernel (SLK), uses a specifically designed data structure for issuing sequences of commands from the CPU to the GPU. It minimizes CPU-GPU communication, exploits streams of concurrent execution of different device-side functions on different Streaming Multiprocessors, and minimizes kernel launch overhead. Results obtained in two completely different experimental settings show that our approach dramatically increases the performance of the tested genetic algorithms compared to a baseline implementation that also runs on a GPU but does not use the proposed approach. SLK does not require substantial code rewriting, and it is also compared with features newly introduced in the latest CUDA development toolkit, such as nested kernel invocations for dynamic parallelism.
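The persistent-thread idea in the abstract can be illustrated with a small CPU-side analogy. The sketch below is not the authors' SLK code and does not use CUDA: it assumes Python threads stand in for persistent GPU threads and a queue stands in for the CPU-to-GPU command structure. All names (`persistent_worker`, `run_scheduled`, `fitness`, `mutate`) are hypothetical. The point it shows is that workers are started once and then consume a sequence of commands, instead of paying a launch cost for every job.

```python
import queue
import threading

STOP = None  # sentinel command telling a persistent worker to exit


def persistent_worker(commands, results):
    """Started once; keeps pulling commands until it receives STOP."""
    while True:
        cmd = commands.get()
        if cmd is STOP:
            break
        job_id, func, arg = cmd
        results[job_id] = func(arg)  # each job writes to its own key


def run_scheduled(jobs, n_workers=4):
    """Run (func, arg) jobs on workers that are launched only once."""
    commands = queue.Queue()
    results = {}
    workers = [
        threading.Thread(target=persistent_worker, args=(commands, results))
        for _ in range(n_workers)
    ]
    for w in workers:
        w.start()
    # The "host" issues a sequence of commands without relaunching workers.
    for job_id, (func, arg) in enumerate(jobs):
        commands.put((job_id, func, arg))
    # One STOP per worker so that every worker terminates.
    for _ in workers:
        commands.put(STOP)
    for w in workers:
        w.join()
    return [results[i] for i in range(len(jobs))]


def fitness(x):
    """Toy fitness evaluation (assumption, for illustration only)."""
    return x * x


def mutate(x):
    """Toy mutation step (assumption, for illustration only)."""
    return x + 1
```

For example, `run_scheduled([(fitness, 3), (mutate, 3)])` dispatches two different functions to the same pool of long-lived workers, loosely mirroring how different device-side functions could be scheduled without a separate launch per job.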

Efficient Implementation of Genetic Algorithms on GP-GPU with Scheduled Persistent CUDA Threads / Capodieci, N., Burgio, P. - 2016-:(2015), pp. 6-12. (7th International Symposium on Parallel Architectures, Algorithms, and Programming, PAAP 2015, Nanjing (China), 12-14 December 2015) [10.1109/PAAP.2015.13].



Year of publication	2015
Date of first publication	2015
Conference title	7th International Symposium on Parallel Architectures, Algorithms, and Programming, PAAP 2015
Conference location	Nanjing (China)
Conference dates	12-14 December 2015
DOI	https://dx.doi.org/10.1109/PAAP.2015.13
WoS ID	WOS:000380466400002
Scopus ID	2-s2.0-84962290520
Volume number	2016-
First page	6
Last page	12
All authors	Capodieci, Nicola; Burgio, Paolo
Type	Conference proceedings paper

Files in this record:

File	Description	Type	Access	Size	Format
File.pdf	Article from IEEE Xplore	VOR (version published by the publisher)	Restricted access	243.49 kB	Adobe PDF
