Steplength selection is a crucial issue for the effectiveness of stochastic gradient methods in the large-scale optimization problems arising in machine learning. In a recent paper, Bollapragada et al. (SIAM J Optim 28(4):3312–3343, 2018) propose to include an adaptive subsampling strategy in a stochastic gradient scheme, with the aim of ensuring that the stochastic gradient directions are descent directions in expectation. In this approach, theoretical convergence properties are preserved under the assumption that, at every iteration, the positive steplength satisfies a suitable bound depending on the inverse of the Lipschitz constant of the objective function gradient. In this paper, we propose to tailor to the stochastic gradient scheme the steplength selection adopted in the full-gradient method known as the limited memory steepest descent method. This strategy, based on the Ritz-like values of a suitable matrix, provides a local estimate of the inverse of the local Lipschitz parameter without introducing line search techniques, while a possible increase in the size of the subsample used to compute the stochastic gradient allows one to control the variance of this direction. Extensive numerical experiments highlight that the new rule makes parameter tuning less expensive than the trial-and-error procedure needed to efficiently select a constant steplength in standard and mini-batch stochastic gradient methods.
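The abstract refers to Ritz-like values computed in the limited memory steepest descent (LMSD) framework. As a rough illustration of that ingredient only, the sketch below computes Ritz values of the Hessian of a deterministic quadratic from a few stored gradients and takes their reciprocals as the next sweep of steplengths; this is a minimal sketch of Fletcher's full-gradient LMSD recipe under simplifying assumptions (exact gradients, quadratic objective), not the authors' stochastic variant, and all variable names are illustrative.

```python
import numpy as np

# Minimal sketch, assuming a quadratic f(x) = 0.5 x'Ax - b'x with exact
# gradients (the paper works with subsampled stochastic gradients instead).
rng = np.random.default_rng(0)
n, m = 50, 5  # problem size and sweep length (number of stored gradients)

# Symmetric positive definite Hessian with known spectrum in [1, 10].
Q0, _ = np.linalg.qr(rng.standard_normal((n, n)))
A = Q0 @ np.diag(np.linspace(1.0, 10.0, n)) @ Q0.T
b = rng.standard_normal(n)

x = rng.standard_normal(n)
alphas = [0.05] * m  # provisional steplengths for the first sweep
G, Y = [], []        # stored gradients g_j and quotients (g_j - g_next)/alpha_j

for a in alphas:
    g = A @ x - b
    x = x - a * g
    g_next = A @ x - b
    G.append(g)
    Y.append((g - g_next) / a)  # equals A @ g exactly for a quadratic

G = np.column_stack(G)
Y = np.column_stack(Y)

# Ritz values: eigenvalues of T = Q' A Q, where G = QR is the thin QR
# factorization; for a quadratic, A Q = Y R^{-1}, so T = Q' Y R^{-1}.
Qm, R = np.linalg.qr(G)
T = Qm.T @ Y @ np.linalg.inv(R)
ritz = np.sort(np.linalg.eigvalsh(0.5 * (T + T.T)))  # symmetrize vs rounding

# Reciprocals of the Ritz values serve as the next sweep of steplengths;
# each Ritz value lies inside the Hessian spectrum, so each 1/theta is a
# local estimate of an inverse curvature (inverse Lipschitz-like) constant.
new_steplengths = 1.0 / ritz
```

In the paper's stochastic setting, the gradients stored in `G` would be subsampled gradients and the sweep would be interleaved with the adaptive growth of the mini-batch size; the linear-algebra kernel above stays the same.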

Ritz-like values in steplength selections for stochastic gradient methods / Franchini, G.; Ruggiero, V.; Zanni, L.. - In: SOFT COMPUTING. - ISSN 1432-7643. - 24:23(2020), pp. 17573-17588. [10.1007/s00500-020-05219-6]

Ritz-like values in steplength selections for stochastic gradient methods

Franchini, G.; Ruggiero, V.; Zanni, L.
2020

Files in this record:
s00500-020-05219-6.pdf — publisher's published version, Adobe PDF, 1.72 MB, restricted access (copy available on request)

Creative Commons license
The metadata in IRIS UNIMORE are released under a Creative Commons CC0 1.0 Universal license, while publication files are released under an Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright infringement, contact Iris Support.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1288419
Citations
  • PMC: n/a
  • Scopus: 7
  • Web of Science: 6