Scores are mathematical combinations of elementary indicators (EIs) that are widely used to measure complex phenomena. Once the theoretical framework is defined, score construction requires a method to aggregate the EIs. The aggregation method is usually chosen from among known methodologies, and its functional form is fixed through a trial-and-error approach. Only then are the predictive power, the distribution of the index, and its ability to stratify the population measured. In this paper, we propose a novel data-driven approach that generates analytic aggregation methods by relying on multi-objective symbolic regression. We translate the properties that the index must exhibit into optimization goals, so that optimal index candidates replicate target variables and exhibit data balancing and stratification. We run experiments on real data sets to solve three main score management problems: data-driven score simplification, generation, and combination. The results show the effectiveness and robustness of the proposed approach.
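The idea of turning index properties into optimization goals can be illustrated with a small Python sketch. It is not the authors' method: the three objective functions (`fit_error`, `imbalance`, `stratification_loss`), the synthetic data, and the hand-written candidate formulas are all assumptions made for illustration, and the candidates stand in for expressions that a symbolic regression search would generate. The sketch scores each candidate aggregation on the three goals and keeps the non-dominated (Pareto-optimal) ones.

```python
import random
from statistics import mean


def fit_error(score, target):
    """Assumed objective 1: mean squared error against the target variable."""
    return mean((s - t) ** 2 for s, t in zip(score, target))


def imbalance(score, n_bins=4):
    """Assumed objective 2: distance of the equal-width histogram from uniform."""
    lo, hi = min(score), max(score)
    if hi == lo:
        return 1.0  # a constant score cannot balance the population
    counts = [0] * n_bins
    for s in score:
        idx = min(int((s - lo) / (hi - lo) * n_bins), n_bins - 1)
        counts[idx] += 1
    n = len(score)
    return sum(abs(c / n - 1 / n_bins) for c in counts) / 2


def stratification_loss(score, target):
    """Assumed objective 3: mean target of the low-score half minus the high-score half."""
    order = sorted(range(len(score)), key=lambda i: score[i])
    half = len(order) // 2
    low = mean(target[i] for i in order[:half])
    high = mean(target[i] for i in order[half:])
    return low - high  # more negative means better separated strata


def dominates(a, b):
    """True if a is no worse than b on every objective and better on at least one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_front(objectives):
    """Names of candidates whose objective vectors no other candidate dominates."""
    return [
        name
        for name, vec in objectives.items()
        if not any(dominates(other, vec)
                   for other_name, other in objectives.items()
                   if other_name != name)
    ]


# Synthetic data (illustrative only): three EIs in [0, 1] and a noisy target.
random.seed(0)
eis = [[random.random() for _ in range(3)] for _ in range(200)]
target = [0.5 * a + 0.3 * b + 0.2 * c + random.gauss(0, 0.05) for a, b, c in eis]

# Hand-written stand-ins for formulas a symbolic regression search might propose.
candidates = {
    "arithmetic_mean": lambda x: sum(x) / len(x),
    "weighted_sum": lambda x: 0.5 * x[0] + 0.3 * x[1] + 0.2 * x[2],
    "maximum": max,
    "product": lambda x: x[0] * x[1] * x[2],
}

objectives = {}
for name, formula in candidates.items():
    score = [formula(row) for row in eis]
    objectives[name] = (
        fit_error(score, target),
        imbalance(score),
        stratification_loss(score, target),
    )

front = pareto_front(objectives)
print(front)
```

All three objectives are minimized, so a candidate stays on the front unless some other candidate is at least as good on every goal and strictly better on one. A real multi-objective search would evolve the formulas themselves rather than rank a fixed list.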

Multi-Objective Symbolic Regression for Data-Driven Scoring System Management / Ferrari, D.; Guidetti, V.; Mandreoli, F. - (2022), pp. 945-950. (Paper presented at the 22nd IEEE International Conference on Data Mining, ICDM 2022, held in the USA in 2022) [10.1109/ICDM54844.2022.00112].

Multi-Objective Symbolic Regression for Data-Driven Scoring System Management

Ferrari D.; Mandreoli F.
2022

Publication date: 2022
Conference: 22nd IEEE International Conference on Data Mining, ICDM 2022
Conference location: USA
Conference date: 2022
Volume: 2022-
Pages: 945-950
Authors: Ferrari, D.; Guidetti, V.; Mandreoli, F.
Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1297547