Gradient Coding with Dynamic Clustering for Straggler Mitigation / Buyukates, B.; Ozfatura, E.; Ulukus, S.; Gunduz, D. - (2021), pp. 1-6. (Paper presented at the 2021 IEEE International Conference on Communications, ICC 2021, held in can in 2021) [10.1109/ICC42927.2021.9500346].
Gradient Coding with Dynamic Clustering for Straggler Mitigation
Gunduz D.
2021
Abstract
In distributed synchronous gradient descent (GD), the per-iteration completion time is limited mainly by the slowest, straggling workers. To speed up GD iterations in the presence of stragglers, coded distributed computation techniques assign redundant computations to workers. In this paper, we propose a novel gradient coding (GC) scheme that utilizes dynamic clustering, denoted by GC-DC, to speed up gradient calculations. Under time-correlated straggling behavior, GC-DC aims at regulating the number of straggling workers in each cluster based on the straggler behavior in the previous iteration. We numerically show that GC-DC provides significant improvements in the average completion time (of each iteration) with no increase in the communication load compared to the original GC scheme.
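The idea in the abstract of regulating the number of stragglers per cluster can be illustrated with a minimal sketch. This is not the GC-DC algorithm from the paper: the function name `recluster`, the round-robin rule, and the equal cluster sizes are assumptions made for illustration only, and the sketch ignores the data-placement constraints a real gradient coding scheme would have to respect. It only shows how the workers observed to straggle in the previous iteration could be spread evenly over clusters.

```python
from typing import List, Sequence, Set


def recluster(workers: Sequence[int],
              prev_stragglers: Set[int],
              num_clusters: int) -> List[List[int]]:
    """Hypothetical helper: form equal-size clusters so that the workers that
    straggled in the previous iteration are spread as evenly as possible."""
    if num_clusters <= 0 or len(workers) % num_clusters != 0:
        raise ValueError("number of workers must be a positive multiple of num_clusters")

    size = len(workers) // num_clusters
    slow = [w for w in workers if w in prev_stragglers]
    fast = [w for w in workers if w not in prev_stragglers]

    clusters: List[List[int]] = [[] for _ in range(num_clusters)]

    # Deal the previous stragglers round-robin, so any two clusters
    # differ by at most one straggler.
    for i, w in enumerate(slow):
        clusters[i % num_clusters].append(w)

    # Fill the remaining slots of each cluster with non-straggling workers.
    fast_iter = iter(fast)
    for cluster in clusters:
        while len(cluster) < size:
            cluster.append(next(fast_iter))

    return clusters


if __name__ == "__main__":
    # 8 workers, 4 clusters; workers 0, 1, 2 straggled in the previous iteration.
    print(recluster(list(range(8)), {0, 1, 2}, 4))
```

With a static clustering such as [0, 1], [2, 3], [4, 5], [6, 7], the first cluster would contain two of the three previous stragglers; the sketch places at most one in any cluster. Under time-correlated straggling, such a rebalancing is only useful to the extent that the previous iteration's stragglers are likely to straggle again.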