A Line Search Based Proximal Stochastic Gradient Algorithm with Dynamical Variance Reduction

Franchini, G.; Porta, F.; Ruggiero, V.; Trombini, I.

doi:10.1007/s10915-022-02084-3

Many optimization problems arising from machine learning applications can be cast as the minimization of the sum of two functions: the first typically represents the expected risk, which in practice is replaced by the empirical risk, and the second imposes a priori information on the solution. Since in general the first term is differentiable and the second is convex, proximal gradient methods are well suited to such problems. However, in large-scale machine learning problems, computing the full gradient of the differentiable term can be prohibitively expensive, making these algorithms unsuitable. For this reason, proximal stochastic gradient methods have been studied extensively in the optimization literature over the last decades. In this paper we develop a proximal stochastic gradient algorithm based on two main ingredients: a technique that dynamically reduces the variance of the stochastic gradients along the iterative process, and a descent condition in expectation for the objective function, used to select the steplength at each iteration. For general objective functionals, the limit points of the sequence generated by the proposed scheme are proved to be stationary points almost surely. For convex objective functionals, both the almost sure convergence of the whole sequence of iterates to a minimum point and an O(1/k) convergence rate for the objective function values are shown. The practical implementation of the proposed method requires neither the computation of the exact gradient of the empirical risk during the iterations nor the tuning of an optimal value for the steplength. Extensive numerical experiments indicate that the proposed approach is robust with respect to the setting of the hyperparameters and competitive with state-of-the-art methods.
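To make the two ingredients concrete, the following is a minimal Python sketch of a proximal stochastic gradient iteration with a backtracking line search and a growing mini-batch. It is not the authors' algorithm. It assumes a least-squares empirical risk and an l1 regularizer, uses geometric growth of the mini-batch size as a simple stand-in for the dynamic variance reduction rule, and applies a standard sufficient-decrease test on the sampled function as a stand-in for the descent condition in expectation. All names and parameters (`prox_sg_line_search`, `batch0`, `growth`, `alpha0`, `beta`) are hypothetical.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (componentwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def prox_sg_line_search(A, b, lam, iters=200, batch0=8, growth=1.05,
                        alpha0=1.0, beta=0.5, seed=0):
    """Sketch: minimize (1/(2n)) * ||A x - b||^2 + lam * ||x||_1.

    Assumed simplifications (not taken from the paper):
    - the mini-batch size grows geometrically, capped at n;
    - the steplength is found by backtracking on the sampled smooth term.
    """
    rng = np.random.default_rng(seed)
    n, d = A.shape
    x = np.zeros(d)
    batch = float(batch0)
    for _ in range(iters):
        m = min(n, int(batch))
        idx = rng.choice(n, size=m, replace=False)
        Ab, bb = A[idx], b[idx]

        def f_batch(z):
            # Sampled approximation of the smooth term.
            r = Ab @ z - bb
            return 0.5 * np.mean(r ** 2)

        g = Ab.T @ (Ab @ x - bb) / m  # stochastic gradient on the mini-batch
        fx = f_batch(x)
        alpha = alpha0
        while True:
            x_new = soft_threshold(x - alpha * g, alpha * lam)
            step = x_new - x
            # Accept alpha once the quadratic upper bound holds on the sample.
            bound = fx + g @ step + (step @ step) / (2.0 * alpha)
            if f_batch(x_new) <= bound or alpha < 1e-12:
                break
            alpha *= beta  # shrink the steplength and retry
        x = x_new
        batch *= growth  # larger samples give lower gradient variance
    return x
```

In this sketch no steplength has to be tuned by hand, since backtracking restarts from `alpha0` at every iteration, and the full gradient is computed only once the growing mini-batch reaches the whole data set.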

A Line Search Based Proximal Stochastic Gradient Algorithm with Dynamical Variance Reduction / Franchini, G.; Porta, F.; Ruggiero, V.; Trombini, I. - In: JOURNAL OF SCIENTIFIC COMPUTING. - ISSN 0885-7474. - 94:1(2023), pp. 23-23. [10.1007/s10915-022-02084-3]



Year of publication: 2023
Journal: JOURNAL OF SCIENTIFIC COMPUTING
Volume: 94
Issue: 1
First page: 23
Last page: 23
DOI: https://dx.doi.org/10.1007/s10915-022-02084-3
WoS code: WOS:000903449900002
Scopus code: 2-s2.0-85144629632
Authors: Franchini, G.; Porta, F.; Ruggiero, V.; Trombini, I.
Type: Journal article
