Speeding Up Distributed Gradient Descent by Utilizing Non-persistent Stragglers

Ozfatura, E.; Gunduz, D.; Ulukus, S.

doi:10.1109/ISIT.2019.8849684

When gradient descent (GD) is scaled to many parallel computing servers (workers) for large-scale machine learning problems, its per-iteration computation time is limited by the straggling workers. Coded distributed GD (DGD) can tolerate straggling workers by assigning redundant computations to the workers, but in most existing schemes, each non-straggling worker transmits one message per iteration to the parameter server (master) after completing all its computations. We allow multiple computations to be conveyed from each worker per iteration in order to also exploit the computations executed by straggling workers. We show that the average completion time per iteration can be reduced significantly at a reasonable increase in the communication load. We also propose a general coded DGD technique that can trade off the average computation time against the communication load.
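
The trade-off described in the abstract can be illustrated with a toy simulation. The sketch below is not the authors' coded scheme: it assumes an uncoded setup with cyclic replication, in which each of `n` workers is assigned `r` of the `n` data parts, and it assumes i.i.d. exponential task times. The function name `simulate_iteration` and all parameters are hypothetical. It compares a one-message-per-worker baseline (a worker reports only after finishing all `r` tasks) with a multi-message variant (a worker reports each partial result as soon as it is ready), and counts the messages the master receives before it has every part.

```python
import random


def simulate_iteration(n, r, rng):
    """Simulate one iteration; return (t_single, t_multi, msgs_single, msgs_multi).

    Assumptions (illustrative only): worker i processes data parts
    i, i+1, ..., i+r-1 (mod n) in that order, and each task takes an
    i.i.d. exponential time with mean 1.
    """
    single_events = []  # (time at which worker i finishes all r tasks, i)
    multi_events = []   # (time at which one task finishes, data part index)
    for i in range(n):
        t = 0.0
        for j in range(r):
            t += rng.expovariate(1.0)
            multi_events.append((t, (i + j) % n))
        single_events.append((t, i))

    # Baseline: one message per worker, sent after all of its tasks are done.
    covered, msgs_single, t_single = set(), 0, None
    for t, i in sorted(single_events):
        covered.update((i + j) % n for j in range(r))
        msgs_single += 1
        if len(covered) == n:
            t_single = t
            break

    # Multi-message: every finished task is sent immediately, so partial
    # work done by slow workers still reaches the master.
    covered, msgs_multi, t_multi = set(), 0, None
    for t, part in sorted(multi_events):
        covered.add(part)
        msgs_multi += 1
        if len(covered) == n:
            t_multi = t
            break

    return t_single, t_multi, msgs_single, msgs_multi


if __name__ == "__main__":
    rng = random.Random(0)
    trials = 2000
    totals = [0.0, 0.0, 0.0, 0.0]
    for _ in range(trials):
        for k, value in enumerate(simulate_iteration(10, 3, rng)):
            totals[k] += value
    labels = ["avg time (single)", "avg time (multi)",
              "avg msgs (single)", "avg msgs (multi)"]
    for label, total in zip(labels, totals):
        print(f"{label}: {total / trials:.3f}")
```

In this toy model the multi-message completion time can never exceed the baseline for the same delay realization, because every result the baseline master has received has also reached the multi-message master by then. The price is a larger number of messages per iteration, which mirrors the computation-time versus communication-load trade-off stated in the abstract.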

Speeding Up Distributed Gradient Descent by Utilizing Non-persistent Stragglers / Ozfatura, E.; Gunduz, D.; Ulukus, S. - 2019-:(2019), pp. 2729-2733. (Paper presented at the 2019 IEEE International Symposium on Information Theory, ISIT 2019, held at La Maison de La Mutualite, France, in 2019) [10.1109/ISIT.2019.8849684].



Year of publication: 2019
Conference title: 2019 IEEE International Symposium on Information Theory, ISIT 2019
Conference venue: La Maison de La Mutualite, France
Conference date: 2019
DOI: https://dx.doi.org/10.1109/ISIT.2019.8849684
WoS ID: WOS:000489100302165
Scopus ID: 2-s2.0-85073170451
Volume: 2019-
First page: 2729
Last page: 2733
All authors: Ozfatura, E.; Gunduz, D.; Ulukus, S.
Type: Conference proceedings paper

Files in this record:

There are no files associated with this record.


The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact IRIS Support.