Learning rate selection in stochastic gradient methods based on line search strategies

Finite-sum problems appear as the sample average approximation of a stochastic optimization problem and often arise in machine learning applications with large-scale data sets. A very popular approach to solving finite-sum problems is the stochastic gradient method. It is well known that a proper strategy for selecting the hyperparameters of this method (i.e. the set of parameters chosen a priori), and in particular the learning rate, is needed to guarantee convergence properties and good practical performance. In this paper, we analyse standard and line search based updating rules for setting the learning rate sequence, also in relation to the size of the mini-batch chosen to compute the current stochastic gradient. Extensive numerical experiments are carried out to evaluate the effectiveness of the discussed strategies on convex and non-convex finite-sum test problems, highlighting that the line search based methods avoid an expensive initial setting of the hyperparameters. The line search based approaches have also been applied to train a Convolutional Neural Network, providing very promising results.
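
As an illustration of the kind of line search based learning rate selection the abstract refers to, the following minimal sketch applies an Armijo-type backtracking rule to each mini-batch gradient step on a least-squares finite-sum problem. It is not the authors' algorithm: the test problem, the function names (`minibatch_loss_grad`, `sgd_armijo`) and the parameters (`alpha0`, `shrink`, `c`, `max_backtracks`) are assumptions chosen for illustration only.

```python
import numpy as np


def minibatch_loss_grad(w, A, b, idx):
    """Least-squares loss and gradient averaged over the samples in idx."""
    r = A[idx] @ w - b[idx]
    loss = 0.5 * np.mean(r ** 2)
    grad = A[idx].T @ r / len(idx)
    return loss, grad


def sgd_armijo(A, b, batch_size=16, epochs=30, alpha0=1.0,
               shrink=0.5, c=1e-4, max_backtracks=50, seed=0):
    """Mini-batch SGD where each step size is found by Armijo backtracking.

    Hypothetical sketch: the sufficient-decrease test is evaluated on the
    same mini-batch that produced the stochastic gradient.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    w = np.zeros(d)
    for _ in range(epochs):
        perm = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = perm[start:start + batch_size]
            loss, g = minibatch_loss_grad(w, A, b, idx)
            g_norm_sq = g @ g
            alpha = alpha0  # restart from the initial trial learning rate
            for _ in range(max_backtracks):
                trial_loss, _ = minibatch_loss_grad(w - alpha * g, A, b, idx)
                # Armijo condition: accept alpha if the mini-batch loss
                # decreases by at least c * alpha * ||g||^2.
                if trial_loss <= loss - c * alpha * g_norm_sq:
                    break
                alpha *= shrink  # otherwise reduce the learning rate
            w = w - alpha * g
    return w
```

In this sketch the only quantities to choose in advance are the initial trial value `alpha0`, the reduction factor `shrink` and the constant `c`; the learning rate actually used at each iteration is determined by the backtracking loop rather than by a hand-tuned schedule.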

Learning rate selection in stochastic gradient methods based on line search strategies / Franchini, G.; Porta, F.; Ruggiero, V.; Trombini, I.; Zanni, L.. - In: APPLIED MATHEMATICS IN SCIENCE AND ENGINEERING. - ISSN 2769-0911. - 31:1(2023), pp. 2164000-2164000. [10.1080/27690911.2022.2164000]


Franchini G.;Porta F.;Ruggiero V.;Trombini I.;Zanni L.

2023



Publication year: 2023
Journal: APPLIED MATHEMATICS IN SCIENCE AND ENGINEERING
Volume: 31
Issue: 1
First page: 2164000
Last page: 2164000
DOI: https://dx.doi.org/10.1080/27690911.2022.2164000
WoS ID: WOS:000910928500001
Scopus ID: 2-s2.0-85146781895
All authors: Franchini, G.; Porta, F.; Ruggiero, V.; Trombini, I.; Zanni, L.
Type: Journal article

Files in this record:

File	Size	Format
Learning rate selection in stochastic gradient methods based on line search strategies.pdf (Open access; type: VOR, publisher's published version)	2.76 MB	Adobe PDF


The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1306506
