Steplength and Mini-batch Size Selection in Stochastic Gradient Methods

Franchini, G.; Ruggiero, V.; Zanni, L.

doi:10.1007/978-3-030-64580-9_22

Steplength selection is crucial to the effectiveness of stochastic gradient methods for the large-scale optimization problems arising in machine learning. In a recent paper, Bollapragada et al. [1] propose including an adaptive subsampling strategy in a stochastic gradient scheme. We propose to combine this approach with a steplength selection rule borrowed from the full-gradient scheme known as the Limited Memory Steepest Descent (LMSD) method [4] and suitably tailored to the stochastic framework. This strategy, based on the Ritz-like values of a suitable matrix, provides a local estimate of the Lipschitz constant of the gradient of the objective function without introducing line-search techniques, while the possible increase in the subsample size used to compute the stochastic gradient makes it possible to control the variance of this direction. Extensive numerical experiments on convex and non-convex loss functions show that the new rule makes parameter tuning less expensive than selecting a suitable constant steplength in standard and mini-batch stochastic gradient methods. The proposed procedure is also compared with the Momentum and ADAM methods.
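The two ingredients named in the abstract can be illustrated with a minimal sketch. It is not the authors' algorithm. The first function computes Ritz values of a small quadratic model from two consecutive gradients and returns their reciprocals as candidate steplengths; it uses the matrix A explicitly for clarity, whereas LMSD-type schemes are designed to work from stored gradients. The second function is a norm-test-style variance check of the kind used in adaptive subsampling; the exact test in [1] may differ. The function names and the threshold `theta` are hypothetical.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def ritz_steplengths(A, g0, g1):
    """Reciprocals of the two Ritz values of A on span{g0, g1}.

    Illustration only: A is used explicitly here, and the two gradients
    are assumed to be linearly independent.
    """
    # Gram-Schmidt: orthonormal basis (q0, q1) of the span of the gradients.
    n0 = math.sqrt(dot(g0, g0))
    q0 = [x / n0 for x in g0]
    r = dot(q0, g1)
    w = [x - r * y for x, y in zip(g1, q0)]
    n1 = math.sqrt(dot(w, w))
    q1 = [x / n1 for x in w]
    # Projected 2x2 symmetric matrix T = Q^T A Q with entries a, b, c.
    Aq0 = [dot(row, q0) for row in A]
    Aq1 = [dot(row, q1) for row in A]
    a, b, c = dot(q0, Aq0), dot(q0, Aq1), dot(q1, Aq1)
    # Closed-form eigenvalues of [[a, b], [b, c]] are the Ritz values.
    mean, rad = (a + c) / 2.0, math.hypot((a - c) / 2.0, b)
    return [1.0 / (mean - rad), 1.0 / (mean + rad)]


def needs_larger_batch(per_example_grads, theta=0.9):
    """Norm-test-style check (theta is a hypothetical threshold).

    Returns True when the estimated variance of the mini-batch gradient
    is large relative to its squared norm, which suggests increasing the
    subsample size.
    """
    n = len(per_example_grads)
    dim = len(per_example_grads[0])
    mean = [sum(g[j] for g in per_example_grads) / n for j in range(dim)]
    var = sum(
        (g[j] - mean[j]) ** 2 for g in per_example_grads for j in range(dim)
    ) / (n - 1)
    return var / n > theta ** 2 * dot(mean, mean)


# Usage on a toy quadratic with Hessian diag(1, 10, 100).
A = [[1.0, 0.0, 0.0], [0.0, 10.0, 0.0], [0.0, 0.0, 100.0]]
g0 = [1.0, 1.0, 1.0]
# Gradient after one steepest-descent step of length 0.01: g1 = g0 - 0.01 * A g0.
g1 = [g - 0.01 * dot(row, g0) for g, row in zip(g0, A)]
steps = ritz_steplengths(A, g0, g1)
```

Because Ritz values of a symmetric matrix lie between its smallest and largest eigenvalues, the resulting steplengths stay between the reciprocals of those eigenvalues, which is what makes them usable as local curvature-based steplengths without a line search.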

Steplength and Mini-batch Size Selection in Stochastic Gradient Methods / Franchini, G.; Ruggiero, V.; Zanni, L. - 12566 (2020), pp. 259-263. (Paper presented at the 6th International Conference on Machine Learning, Optimization, and Data Science, LOD 2020, held in Italy in 2020) [10.1007/978-3-030-64580-9_22].

Year of publication: 2020
Conference title: 6th International Conference on Machine Learning, Optimization, and Data Science, LOD 2020
Conference location: Italy
Conference date: 2020
DOI: https://dx.doi.org/10.1007/978-3-030-64580-9_22
Scopus ID: 2-s2.0-85101325732
Series: Lecture Notes in Artificial Intelligence
Volume: 12566
First page: 259
Last page: 263
All authors: Franchini, G.; Ruggiero, V.; Zanni, L.
Type: Conference proceedings paper
