Reproducible experiments on Three-Dimensional Entity Resolution with JedAI / Mandilaras, George; Papadakis, George; Gagliardelli, Luca; Simonini, Giovanni; Thanos, Emmanouil; Giannakopoulos, George; Bergamaschi, Sonia; Palpanas, Themis; Koubarakis, Manolis; Lara-Clares, Alicia; Farina, Antonio. - In: INFORMATION SYSTEMS. - ISSN 0306-4379. - 102:(2021), pp. 101830-101830. [10.1016/j.is.2021.101830]

Reproducible experiments on Three-Dimensional Entity Resolution with JedAI

Luca Gagliardelli; Giovanni Simonini; Sonia Bergamaschi
2021

Abstract

In Papadakis et al. [1], we presented the latest release of JedAI, an open-source Entity Resolution (ER) system that allows for building a large variety of end-to-end ER pipelines. Through a thorough experimental evaluation, we compared a schema-agnostic ER pipeline based on blocks with another schema-based ER pipeline based on similarity joins. We applied them to 10 established, real-world datasets and assessed them with respect to effectiveness and time efficiency. Special care was taken to juxtapose their scalability, too, using seven established, synthetic datasets. Moreover, we experimentally compared the effectiveness of the batch schema-agnostic ER pipeline with its progressive counterpart. In this companion paper, we describe how to reproduce the entire experimental study that pertains to JedAI’s serial execution through its intuitive user interface. We also explain how to examine the robustness of the parameter configurations we have selected.
Files in this record:
Reproducible experiments on Three-Dimensional Entity Resolution with JedAI.pdf
Access: open access
Type: Author's pre-print (pre-refereeing draft)
Size: 1.16 MB
Format: Adobe PDF
Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.

Use this identifier to cite or link to this document: http://hdl.handle.net/11380/1247511