Efficient Distributed Machine Learning via Combinatorial Multi-Armed Bandits

Egger, M.; Bitar, R.; Wachter-Zeh, A.; Gunduz, D.

doi:10.1109/ISIT50566.2022.9834499

We consider the distributed stochastic gradient descent problem, where a main node distributes gradient calculations among n workers from which at most b ≤ n can be utilized in parallel. By assigning tasks to all the workers and waiting only for the k fastest ones, the main node can trade-off the error of the algorithm with its runtime by gradually increasing k as the algorithm evolves. However, this strategy, referred to as adaptive k-sync, can incur additional costs since it ignores the computational efforts of slow workers. We propose a cost-efficient scheme that assigns tasks only to k workers and gradually increases k. As the response times of the available workers are unknown to the main node a priori, we utilize a combinatorial multi-armed bandit model to learn which workers are the fastest while assigning gradient calculations, and to minimize the effect of slow workers. Assuming that the mean response times of the workers are independent and exponentially distributed with different means, we give empirical and theoretical guarantees on the regret of our strategy, i.e., the extra time spent to learn the mean response times of the workers. Compared to adaptive k-sync, our scheme achieves significantly lower errors with the same computational efforts while being inferior in terms of speed.

Efficient Distributed Machine Learning via Combinatorial Multi-Armed Bandits / Egger, M.; Bitar, R.; Wachter-Zeh, A.; Gunduz, D.. - 2022-:(2022), pp. 1653-1658. (Intervento presentato al convegno 2022 IEEE International Symposium on Information Theory, ISIT 2022 tenutosi a fin nel 2022) [10.1109/ISIT50566.2022.9834499].

Efficient Distributed Machine Learning via Combinatorial Multi-Armed Bandits

Egger M.;Bitar R.;Wachter-Zeh A.;Gunduz D.

2022

Abstract

We consider the distributed stochastic gradient descent problem, where a main node distributes gradient calculations among n workers from which at most b ≤ n can be utilized in parallel. By assigning tasks to all the workers and waiting only for the k fastest ones, the main node can trade-off the error of the algorithm with its runtime by gradually increasing k as the algorithm evolves. However, this strategy, referred to as adaptive k-sync, can incur additional costs since it ignores the computational efforts of slow workers. We propose a cost-efficient scheme that assigns tasks only to k workers and gradually increases k. As the response times of the available workers are unknown to the main node a priori, we utilize a combinatorial multi-armed bandit model to learn which workers are the fastest while assigning gradient calculations, and to minimize the effect of slow workers. Assuming that the mean response times of the workers are independent and exponentially distributed with different means, we give empirical and theoretical guarantees on the regret of our strategy, i.e., the extra time spent to learn the mean response times of the workers. Compared to adaptive k-sync, our scheme achieves significantly lower errors with the same computational efforts while being inferior in terms of speed.

Scheda breve

Scheda completa

Scheda completa (DC)

	Anno di pubblicazione
	
				2022
			
	Titolo del Convegno
	
				2022 IEEE International Symposium on Information Theory, ISIT 2022
			
	Luogo del Convegno
	
				fin
			
	Data del Convegno
	
				2022
			
	Codice DOI
	
				https://dx.doi.org/10.1109/ISIT50566.2022.9834499
			
	Codice Scopus
	
				2-s2.0-85136260894
			
	N° del Volume
	
				2022-
			
	Pagina iniziale
	
				1653
			
	Pagina finale
	
				1658
			
	Tutti gli autori
	
						Egger, M.; Bitar, R.; Wachter-Zeh, A.; Gunduz, D.
					
	Citazione
	
				Efficient Distributed Machine Learning via Combinatorial Multi-Armed Bandits / Egger, M.; Bitar, R.; Wachter-Zeh, A.; Gunduz, D.. - 2022-:(2022), pp. 1653-1658. (Intervento presentato al  convegno 2022 IEEE International Symposium on Information Theory, ISIT 2022 tenutosi a fin nel 2022) [10.1109/ISIT50566.2022.9834499].
			
	Tipologia
	
				Relazione in Atti di Convegno

File in questo prodotto:

File	Dimensione	Formato
Efficient_Distributed_Machine_Learning_via_Combinatorial_Multi-Armed_Bandits.pdf Accesso riservato Tipologia: Versione pubblicata dall'editore Dimensione 2.04 MB Formato Adobe PDF Visualizza/Apri Richiedi una copia	2.04 MB	Adobe PDF	Visualizza/Apri Richiedi una copia

Pubblicazioni consigliate

I metadati presenti in IRIS UNIMORE sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono rilasciati con licenza Attribuzione 4.0 Internazionale (CC BY 4.0), salvo diversa indicazione.
In caso di violazione di copyright, contattare Supporto Iris