Variable metric proximal stochastic gradient methods with additional sampling / Krklec Jerinkić, N.; Porta, F.; Ruggiero, V.; Trombini, I.. - In: COMPUTATIONAL OPTIMIZATION AND APPLICATIONS. - ISSN 0926-6003. - (2025), pp. ...-.... [10.1007/s10589-025-00720-w]
Variable metric proximal stochastic gradient methods with additional sampling
Porta, F.; Ruggiero, V.
2025
Abstract
Regularized empirical risk minimization problems arise in a variety of applications, including machine learning, signal processing, and image processing. Proximal stochastic gradient algorithms are a standard approach for solving these problems due to their low per-iteration computational cost and relatively simple implementation. This paper introduces a class of proximal stochastic gradient methods built on three key elements: a variable metric underlying the iterations, a stochastic line search governing the decrease properties, and an incremental mini-batch size technique based on additional sampling. Convergence results for the proposed algorithms are proved under different hypotheses on the function to minimize. In particular, no Lipschitz continuity assumption is required on the gradient of the differentiable part of the objective function. Possible strategies to automatically select the parameters of the suggested scheme are discussed. Numerical experiments on both binary classification and nonlinear regression problems show the effectiveness of the suggested approach compared to other state-of-the-art proximal stochastic gradient methods.
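To picture how the three ingredients named in the abstract fit together, the following is a minimal sketch of a generic variable-metric proximal stochastic gradient loop. It is illustrative only and is not the algorithm proposed in the paper: the function names (`prox_sgd_variable_metric`, `grad_batch`, `loss_batch`), the choice of an ℓ1 regularizer, the Barzilai-Borwein-style diagonal metric, the Armijo-type backtracking rule, and the geometric mini-batch growth are all assumptions made for the example.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (the simple nondifferentiable term used here)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def prox_sgd_variable_metric(grad_batch, loss_batch, x0, n_samples,
                             lam=1e-3, alpha0=1.0, eta=0.5, c=1e-4,
                             batch0=8, growth=1.2, max_iter=50, rng=None):
    """Illustrative loop combining (i) a diagonal variable metric,
    (ii) an Armijo-type line search on the sampled objective, and
    (iii) a mini-batch size that grows over the iterations.
    Hypothetical interface, not the method from the paper."""
    rng = np.random.default_rng() if rng is None else rng
    x = x0.copy()
    d = np.ones_like(x)                 # diagonal metric (scaling) estimate
    batch_size = batch0
    for _ in range(max_iter):
        idx = rng.choice(n_samples, size=min(int(batch_size), n_samples), replace=False)
        g = grad_batch(x, idx)                         # stochastic gradient on the mini-batch
        f_old = loss_batch(x, idx) + lam * np.abs(x).sum()
        alpha = alpha0
        while True:
            # scaled proximal step: the metric enters both the gradient step and the prox
            x_trial = soft_threshold(x - alpha * g / d, lam * alpha / d)
            f_new = loss_batch(x_trial, idx) + lam * np.abs(x_trial).sum()
            # sufficient-decrease test on the *sampled* objective (stochastic line search);
            # the alpha safeguard guarantees termination of the backtracking
            if f_new <= f_old - c / alpha * np.sum(d * (x_trial - x) ** 2) or alpha < 1e-10:
                break
            alpha *= eta                               # backtrack
        # crude diagonal metric update from the step (assumption: a simple
        # Barzilai-Borwein-like scalar scaling; the paper's metric may differ)
        s = x_trial - x
        if np.dot(s, s) > 0:
            y = grad_batch(x_trial, idx) - g
            bb = np.dot(s, y) / np.dot(s, s)
            d = np.clip(np.full_like(x, max(bb, 1e-8)), 1e-3, 1e3)
        x = x_trial
        batch_size = min(growth * batch_size, n_samples)   # incremental mini-batch size
    return x
```

Here `grad_batch(x, idx)` and `loss_batch(x, idx)` are assumed to return the average gradient and loss of the sampled terms; any smooth per-sample loss, such as the logistic loss used in binary classification, fits this interface.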
| File | Type | License | Size | Format |
|---|---|---|---|---|
| unpaywall-bitstream-145246540.pdf (Open access) | VOR - Version published by the publisher | Creative Commons | 10 MB | Adobe PDF |