In relational databases, the full disjunction operator is an associative extension of the full outer join to an arbitrary number of relations. Its goal is to maximize the information that can be extracted from a database by connecting all tables through all join paths. Full disjunctions have been proposed for several scenarios, such as data integration and knowledge extraction. One of the main obstacles to their adoption in real business scenarios is the long time their computation requires. This paper addresses this limitation by introducing parafd, a novel approach based on parallel computing techniques that implements the full disjunction operator in both an exact and an approximate version. Our proposal has been compared with state-of-the-art algorithms, which have also been re-implemented to run in parallel. The experiments show that its time performance surpasses that of existing approaches. Finally, we have experimented with treating the full disjunction as a collection of documents indexed by a textual search engine. In this way, we provide a simple technique for performing keyword search over relational databases. The results obtained on a benchmark show high precision and recall levels, even when compared with existing proposals.
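To make the operator described above concrete, the following is a minimal brute-force sketch of a full disjunction over three toy relations. It is not the parafd algorithm and makes no claim about how the paper implements the operator. The relation names, attributes, and rows are invented for illustration, and the sketch assumes that source tuples contain no nulls and that two relations are joinable whenever they share an attribute name.

```python
from itertools import combinations, product

# Hypothetical toy data: each relation is a list of rows (dicts without nulls).
RELATIONS = {
    "Author": [{"aid": 1, "name": "Ann"}, {"aid": 2, "name": "Bob"}],
    "Writes": [{"aid": 1, "pid": 10}],
    "Paper": [{"pid": 10, "title": "FD"}, {"pid": 11, "title": "Joins"}],
}


def consistent(t, u):
    """Two rows are join-consistent if they agree on every shared attribute."""
    return all(t[a] == u[a] for a in t.keys() & u.keys())


def connected(rows):
    """True if the rows form a connected graph (edge = a shared attribute)."""
    seen, frontier = {0}, [0]
    while frontier:
        i = frontier.pop()
        for j in range(len(rows)):
            if j not in seen and rows[i].keys() & rows[j].keys():
                seen.add(j)
                frontier.append(j)
    return len(seen) == len(rows)


def subsumed(t, u):
    """t is subsumed by u if u agrees with t wherever t is not null."""
    return t != u and all(x is None or x == y for x, y in zip(t, u))


def full_disjunction(relations):
    """Enumerate every connected, join-consistent set of rows (at most one
    row per relation), pad each merged row with None, and keep only the
    rows that are not subsumed by another one."""
    names = list(relations)
    attrs = sorted({a for rows in relations.values() for r in rows for a in r})
    candidates = set()
    for k in range(1, len(names) + 1):
        for subset in combinations(names, k):
            for rows in product(*(relations[n] for n in subset)):
                pairwise_ok = all(consistent(t, u) for t, u in combinations(rows, 2))
                if connected(rows) and pairwise_ok:
                    merged = {}
                    for r in rows:
                        merged.update(r)
                    candidates.add(tuple(merged.get(a) for a in attrs))
    result = {t for t in candidates if not any(subsumed(t, u) for u in candidates)}
    return attrs, result


if __name__ == "__main__":
    attrs, rows = full_disjunction(RELATIONS)
    print(attrs)
    for row in sorted(rows, key=repr):
        print(row)
```

On this toy input, the author without papers and the paper without authors both survive as null-padded rows, which is the "maximize the information" behavior the abstract refers to. The sketch enumerates every combination of rows, so its cost grows exponentially with the number of relations; this illustrates why computing full disjunctions is expensive.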

Parallelizing computations of full disjunctions / Paganelli, Matteo; Beneventano, Domenico; Guerra, Francesco; Sottovia, Paolo. - In: BIG DATA RESEARCH. - ISSN 2214-5796. - 17:(2019), pp. 18-31. [10.1016/j.bdr.2019.07.002]

Parallelizing computations of full disjunctions

Paganelli, Matteo; Beneventano, Domenico; Guerra, Francesco; Sottovia, Paolo
2019

Files in this record:
Parallelizing computations of full disjunction.pdf
Type: Publisher's version (published version). Access: not available. Size: 1.29 MB. Format: Adobe PDF.
Parallelizing_Computations_of_Full_Disjunctions__Camera_Ready_ (1).pdf
Type: Author's post-print (post-review draft). Access: open access. Size: 760.35 kB. Format: Adobe PDF.
Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.
In case of copyright violation, contact Iris Support.

Use this identifier to cite or link to this document: http://hdl.handle.net/11380/1179096