A stochastic gradient method with variance control and variable learning rate for Deep Learning / Franchini, G.; Porta, F.; Ruggiero, V.; Trombini, I.; Zanni, L.. - In: JOURNAL OF COMPUTATIONAL AND APPLIED MATHEMATICS. - ISSN 0377-0427. - 451:(2024), pp. 116083-116083. [10.1016/j.cam.2024.116083]
A stochastic gradient method with variance control and variable learning rate for Deep Learning
Franchini G.; Porta F.; Ruggiero V.; Trombini I.; Zanni L.
2024
Abstract
In this paper we study a stochastic gradient algorithm which rules the increase of the mini-batch size in a predefined fashion and automatically adjusts the learning rate by means of a monotone or non-monotone line search procedure. The mini-batch size is incremented at a suitable a priori rate throughout the iterative process, so that the variance of the stochastic gradients is progressively reduced. The a priori rate is not subject to restrictive assumptions, allowing for the possibility of a slow increase in the mini-batch size. On the other hand, the learning rate can vary non-monotonically throughout the iterations, as long as it is appropriately bounded. Convergence results for the proposed method are provided for both convex and non-convex objective functions. Moreover, the algorithm can be proved to enjoy a global linear rate of convergence on strongly convex functions. The low per-iteration cost, the limited memory requirements and the robustness with respect to the hyperparameter setting make the suggested approach well-suited for implementation within the deep learning framework, also on GPGPU-equipped architectures. Numerical results on training deep neural networks for multiclass image classification show a promising behaviour of the proposed scheme with respect to similar state-of-the-art competitors.
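The abstract describes two ingredients: a mini-batch size that grows at a predefined (a priori) rate so that the variance of the stochastic gradients is reduced, and a learning rate chosen by a monotone or non-monotone line search and kept within fixed bounds. The following Python sketch illustrates these ideas on a toy least-squares problem; it is not the authors' implementation, and the growth factor, line-search rule, bounds and all hyperparameter values are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): SGD with a predefined increase of the
# mini-batch size and an Armijo-type, optionally non-monotone, line search on
# the sampled loss, with the learning rate kept in a fixed interval.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic least-squares problem: f(w) = (1/N) * sum_i 0.5 * (a_i^T w - b_i)^2
N, d = 10_000, 20
A = rng.standard_normal((N, d))
w_true = rng.standard_normal(d)
b = A @ w_true + 0.1 * rng.standard_normal(N)

def batch_loss_grad(w, idx):
    """Loss and gradient evaluated on the sampled mini-batch `idx`."""
    r = A[idx] @ w - b[idx]
    return 0.5 * np.mean(r ** 2), A[idx].T @ r / idx.size

# Illustrative hyperparameters (not taken from the paper)
w = np.zeros(d)
m0, growth = 16, 1.05                            # initial mini-batch size and a priori growth factor
alpha, alpha_min, alpha_max = 1.0, 1e-4, 10.0    # learning rate and its bounds
beta, sigma = 0.5, 1e-4                          # backtracking factor and Armijo constant
M = 5                                            # memory for the non-monotone reference value
recent = []                                      # recent mini-batch losses

for k in range(200):
    m = min(N, int(m0 * growth ** k))            # predefined mini-batch growth
    idx = rng.choice(N, size=m, replace=False)
    loss, grad = batch_loss_grad(w, idx)

    recent = (recent + [loss])[-M:]
    ref = max(recent)                            # non-monotone reference (ref = loss gives the monotone variant)

    alpha = min(alpha_max, max(alpha_min, 2 * alpha))   # tentative step, kept within [alpha_min, alpha_max]
    while True:
        new_loss, _ = batch_loss_grad(w - alpha * grad, idx)
        if new_loss <= ref - sigma * alpha * np.dot(grad, grad) or alpha <= alpha_min:
            break
        alpha *= beta                            # backtrack until the Armijo condition (or the lower bound) is met

    w -= alpha * grad

print("final full loss:", 0.5 * np.mean((A @ w - b) ** 2))
```

The only per-iteration overhead with respect to plain SGD is the extra loss evaluations during backtracking, consistent with the low per-iteration cost and limited memory requirements mentioned in the abstract.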
File | Type | Size | Format
---|---|---|---
1-s2.0-S0377042724003327-main.pdf (Open access) | Publisher's published version | 717.12 kB | Adobe PDF
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal licence, while the publication files are released under the Attribution 4.0 International licence (CC BY 4.0), unless otherwise indicated.
In case of copyright infringement, contact Supporto Iris.