Gradient Coding with Clustering and Multi-Message Communication / Ozfatura, E.; Gunduz, D.; Ulukus, S. - (2019), pp. 42-46. (Paper presented at the 2019 IEEE Data Science Workshop, DSW 2019, held in the USA in 2019) [10.1109/DSW.2019.8755563].

Gradient Coding with Clustering and Multi-Message Communication

D. Gunduz
2019

Abstract

Gradient descent (GD) methods are commonly employed in machine learning problems to optimize the parameters of the model in an iterative fashion. For problems with massive datasets, computations are distributed across many parallel computing servers (i.e., workers) to speed up GD iterations. While distributed computing can increase the computation speed significantly, the per-iteration completion time is limited by the slowest straggling workers. Coded distributed computing can mitigate straggling workers by introducing redundant computations; however, existing coded computing schemes are mainly designed against persistent stragglers, and partial computations at straggling workers are discarded, leading to wasted computational capacity. In this paper, we propose a novel gradient coding (GC) scheme that allows multiple coded computations to be conveyed from each worker to the master per iteration. We numerically show that the proposed GC scheme with multi-message communication (MMC) and clustering significantly reduces the average per-iteration completion time, with minimal or no increase in the communication load.
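Since no files are attached to this record, the following minimal Python sketch illustrates the kind of gradient coding the abstract refers to: the standard fractional-repetition GC construction (Tandon et al.), in which each data partition is replicated at s + 1 workers so the master can recover the exact gradient despite up to s stragglers. The toy least-squares problem, all variable names, and the straggler pattern are assumptions for illustration only; this is not the paper's clustering/MMC scheme.

import numpy as np

def full_gradient(X, y, w):
    # Gradient of the least-squares loss 0.5 * ||X w - y||^2.
    return X.T @ (X @ w - y)

n_workers, s = 6, 2          # tolerate up to s stragglers; requires (s + 1) | n_workers
group_size = s + 1           # replication factor: each partition is stored at s + 1 workers
n_groups = n_workers // group_size

rng = np.random.default_rng(0)
X = rng.standard_normal((120, 5))
y = rng.standard_normal(120)
w = np.zeros(5)

# Split the dataset into n_workers partitions; partition k belongs to group k // group_size.
parts = np.array_split(np.arange(X.shape[0]), n_workers)

def worker_message(worker_id):
    # Every worker in group g computes and sums the partial gradients of all
    # partitions assigned to group g, then sends this single coded message.
    g = worker_id // group_size
    idx = np.concatenate([parts[k] for k in range(g * group_size, (g + 1) * group_size)])
    return full_gradient(X[idx], y[idx], w)

# Simulate stragglers: at most s workers fail, so every group of s + 1 workers
# still contains at least one survivor.
stragglers = {1, 4}
survivors = [i for i in range(n_workers) if i not in stragglers]

# Master decodes: pick one surviving worker per group and sum their messages.
recovered = np.zeros_like(w)
for g in range(n_groups):
    rep = next(i for i in survivors if i // group_size == g)
    recovered += worker_message(rep)

assert np.allclose(recovered, full_gradient(X, y, w))

Under this scheme each non-straggling worker sends one message per iteration; the paper's MMC extension lets workers send multiple coded messages per iteration so that partial work at slow workers is not wasted.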
Year: 2019
Conference: 2019 IEEE Data Science Workshop, DSW 2019
Location: USA
Pages: 42-46
Authors: Ozfatura, E.; Gunduz, D.; Ulukus, S.
Files for this product:
There are no files associated with this product.

Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright violation, contact Iris Support

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1202716
Citations
  • PMC: N/A
  • Scopus: 19
  • Web of Science (ISI): 19