Heterogeneous coded computation across heterogeneous workers / Sun, Y.; Zhao, J.; Zhou, S.; Gunduz, D. (2019), pp. 1-6. (Paper presented at the 2019 IEEE Global Communications Conference, GLOBECOM 2019, held at Hilton Waikoloa Village Resort, USA, in 2019) [10.1109/GLOBECOM38437.2019.9014006].

Heterogeneous coded computation across heterogeneous workers

D. Gunduz
2019

Abstract

Coded distributed computing frameworks enable large-scale machine learning (ML) models to be trained efficiently in a distributed manner while mitigating the straggler effect. In this work, we consider a multi-task assignment problem in a coded distributed computing system, where multiple masters, each with a different matrix multiplication task, assign computation tasks to workers with heterogeneous computing capabilities. Both dedicated and probabilistic worker assignment models are considered, with the objective of minimizing the average completion time of all tasks. For dedicated worker assignment, greedy algorithms are proposed, and the corresponding optimal load allocation is derived using the Lagrange multiplier method. For probabilistic assignment, the successive convex approximation method is used to solve the resulting non-convex optimization problem. Simulation results show that the proposed algorithms reduce the completion time by 80% compared with an uncoded scheme and by 49% compared with an unbalanced coded scheme.
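To illustrate the general idea of coded computation that the abstract builds on, the following is a minimal sketch of straggler-tolerant matrix-vector multiplication with a single parity block. It is not the paper's assignment or load-allocation algorithm: the three-worker setup, the simple sum-parity code, and the function names `matvec`, `encode`, and `decode` are all assumptions made for illustration.

```python
def matvec(A, x):
    """Plain matrix-vector product on lists of lists."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def encode(A):
    """Split A into two row blocks and add a parity block A1 + A2.

    Assumes A has an even number of rows (illustrative restriction).
    """
    half = len(A) // 2
    A1, A2 = A[:half], A[half:]
    parity = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(A1, A2)]
    return [A1, A2, parity]


def decode(results):
    """Recover A @ x from any two of the three worker outputs.

    `results` maps a worker index (0, 1, or 2) to its output vector.
    """
    if len(results) < 2:
        raise ValueError("need outputs from at least two workers")
    if 0 in results and 1 in results:
        y1, y2 = results[0], results[1]
    elif 0 in results:
        # Worker 1 is missing: A2 @ x = (A1 + A2) @ x - A1 @ x
        y1 = results[0]
        y2 = [p - a for p, a in zip(results[2], y1)]
    else:
        # Worker 0 is missing: A1 @ x = (A1 + A2) @ x - A2 @ x
        y2 = results[1]
        y1 = [p - b for p, b in zip(results[2], y2)]
    return y1 + y2


# Hypothetical task: the master wants A @ x.
A = [[1, 2], [3, 4], [5, 6], [7, 8]]
x = [2, 1]

blocks = encode(A)
outputs = {i: matvec(B, x) for i, B in enumerate(blocks)}

# Pretend worker 0 is a straggler and never reports back.
received = {i: y for i, y in outputs.items() if i != 0}
recovered = decode(received)
```

In this sketch the master can finish as soon as any two of the three workers respond, which is the sense in which coding mitigates stragglers. The paper's contribution concerns how such coded loads are assigned across heterogeneous workers and multiple masters, which this example does not attempt to model.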
Year: 2019
Conference: 2019 IEEE Global Communications Conference, GLOBECOM 2019
Location: Hilton Waikoloa Village Resort, USA
Pages: 1-6
Authors: Sun, Y.; Zhao, J.; Zhou, S.; Gunduz, D.
Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1202630