Symbolic Regression (SR) is an interpretable machine learning technique that discovers closed-form expressions from data, increasingly adopted in scientific and industrial applications. However, its use in privacy-sensitive domains has remained limited, as conventional SR methods require centralized data access for parameter estimation and model selection. We propose Bayesian Federated Symbolic Regression (BFSR), the first framework enabling SR in horizontal federated learning scenarios with formal Bayesian grounding. BFSR formulates federated SR as a sequential process that jointly performs marginal likelihood-based model selection and distributed Bayesian parameter inference, allowing clients to refine local posteriors without sharing data. We instantiate this framework via a two-stage strategy built upon genetic programming: during the evolutionary algorithm, we estimate model quality via the global Bayesian information criterion computed through Gaussian posterior fusion under a large-sample approximation; in the second stage, we apply full sequential Bayesian inference to a subset of candidate models for principled uncertainty quantification and refined marginal likelihood estimation. Empirical evaluations on six datasets under varying levels of data heterogeneity and different numbers of clients show that BFSR consistently outperforms existing federated baselines in predictive accuracy and model interpretability, while also enabling uncertainty quantification. These results establish BFSR as a scalable and trustworthy solution for federated modeling, well-suited to high-stakes domains requiring transparency and reliable epistemic uncertainty estimates.
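
As a rough illustration of what "Gaussian posterior fusion" can mean in general, the sketch below combines local Gaussian posteriors from several clients into one global Gaussian by adding precisions and removing the extra copies of a shared prior. It is not taken from the paper: the linear model with known noise variance, the function names `local_posterior` and `fuse_gaussian_posteriors`, and all numeric values are assumptions chosen so that the fusion is exact and can be checked against a centralized fit.

```python
import numpy as np

def local_posterior(X, y, prior_mean, prior_cov, noise_var):
    """Exact Gaussian posterior for linear regression with known noise variance.

    Hypothetical stand-in for whatever a client computes locally.
    """
    prior_prec = np.linalg.inv(prior_cov)
    prec = prior_prec + X.T @ X / noise_var
    cov = np.linalg.inv(prec)
    mean = cov @ (prior_prec @ prior_mean + X.T @ y / noise_var)
    return mean, cov

def fuse_gaussian_posteriors(posteriors, prior_mean, prior_cov):
    """Fuse K local Gaussian posteriors that were all built from the same prior.

    Uses p(theta | all data) proportional to prior^(1-K) * prod_k p(theta | data_k),
    which in precision form is a sum of local terms minus (K-1) prior terms.
    """
    K = len(posteriors)
    prior_prec = np.linalg.inv(prior_cov)
    prec = -(K - 1) * prior_prec
    eta = -(K - 1) * (prior_prec @ prior_mean)
    for mean_k, cov_k in posteriors:
        prec_k = np.linalg.inv(cov_k)
        prec = prec + prec_k
        eta = eta + prec_k @ mean_k
    cov = np.linalg.inv(prec)
    return cov @ eta, cov

# Synthetic data split across three clients (all values are illustrative).
rng = np.random.default_rng(0)
theta_true = np.array([1.5, -0.7])
prior_mean, prior_cov, noise_var = np.zeros(2), 10.0 * np.eye(2), 0.25

clients = []
for n_k in (40, 60, 100):
    X = rng.normal(size=(n_k, 2))
    y = X @ theta_true + rng.normal(scale=np.sqrt(noise_var), size=n_k)
    clients.append((X, y))

# Each client shares only (mean, covariance), never its rows of data.
local = [local_posterior(X, y, prior_mean, prior_cov, noise_var) for X, y in clients]
fused_mean, fused_cov = fuse_gaussian_posteriors(local, prior_mean, prior_cov)

# Reference: the posterior a single party would get with all data pooled.
X_all = np.vstack([X for X, _ in clients])
y_all = np.concatenate([y for _, y in clients])
central_mean, central_cov = local_posterior(X_all, y_all, prior_mean, prior_cov, noise_var)
```

In this linear-Gaussian toy case the fused and centralized posteriors coincide up to floating-point error; for the nonlinear expressions typical of symbolic regression, a Gaussian form would only hold approximately, which is presumably where the large-sample approximation mentioned in the abstract comes in.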

Mining Trustworthy Symbolic Regression Models in Federated Settings / Billa, M., Guidetti, V., La Rocca, L., Mandreoli, F.. - (2025), pp. 1055-1064. (25th IEEE International Conference on Data Mining, ICDM 2025 Washington DC, USA 2025) [10.1109/icdm65498.2025.00114].

Mining Trustworthy Symbolic Regression Models in Federated Settings

Billa, Mattia; Guidetti, Veronica; La Rocca, Luca; Mandreoli, Federica
2025

Files in this item:
File: Mining_Trustworthy_Symbolic_Regression_Models_in_Federated_Settings.pdf
Access: Open access
Type: AAM - Author's version, revised and accepted for publication
Size: 510.54 kB
Format: Adobe PDF
Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1411111