We propose a novel knowledge-based technique for inter-document similarity computation, called Context Semantic Analysis (CSA). Several specialized approaches built on top of specific knowledge base (e.g. Wikipedia) exist in literature, but CSA differs from them because it is designed to be portable to any RDF knowledge base. Our technique relies on a generic RDF knowledge base (e.g. DBpedia and Wikidata) to extract from it a contextual graph and a semantic contextual vector able to represent the context of a document. We show how CSA exploits such Semantic Context Vector to compute inter-document similarity effectively. Moreover, we show how CSA can be effectively applied in the Information Retrieval domain. Experimental results show that our general technique outperforms baselines built on top of traditional methods, and achieves a performance similar to the ones built on top of specific knowledge bases.

Computing inter-document similarity with Context Semantic Analysis / Beneventano, Domenico; Benedetti, Fabio; Bergamaschi, Sonia; Simonini, Giovanni. - In: INFORMATION SYSTEMS. - ISSN 0306-4379. - 80:(2019), pp. 136-147. [10.1016/j.is.2018.02.009]

Computing inter-document similarity with Context Semantic Analysis

Domenico Beneventano
;
Fabio Benedetti;Sonia Bergamaschi;Giovanni Simonini
2019

Abstract

We propose a novel knowledge-based technique for inter-document similarity computation, called Context Semantic Analysis (CSA). Several specialized approaches built on top of specific knowledge base (e.g. Wikipedia) exist in literature, but CSA differs from them because it is designed to be portable to any RDF knowledge base. Our technique relies on a generic RDF knowledge base (e.g. DBpedia and Wikidata) to extract from it a contextual graph and a semantic contextual vector able to represent the context of a document. We show how CSA exploits such Semantic Context Vector to compute inter-document similarity effectively. Moreover, we show how CSA can be effectively applied in the Information Retrieval domain. Experimental results show that our general technique outperforms baselines built on top of traditional methods, and achieves a performance similar to the ones built on top of specific knowledge bases.
19-feb-2018
80
136
147
Computing inter-document similarity with Context Semantic Analysis / Beneventano, Domenico; Benedetti, Fabio; Bergamaschi, Sonia; Simonini, Giovanni. - In: INFORMATION SYSTEMS. - ISSN 0306-4379. - 80:(2019), pp. 136-147. [10.1016/j.is.2018.02.009]
Beneventano, Domenico; Benedetti, Fabio; Bergamaschi, Sonia; Simonini, Giovanni
File in questo prodotto:
File Dimensione Formato  
beneventano2018.pdf

non disponibili

Tipologia: Versione dell'editore (versione pubblicata)
Dimensione 2.17 MB
Formato Adobe PDF
2.17 MB Adobe PDF   Visualizza/Apri   Richiedi una copia
POSTPRINT_j.is.2018.02.009.pdf

accesso aperto

Tipologia: Post-print dell'autore (bozza post referaggio)
Dimensione 1.66 MB
Formato Adobe PDF
1.66 MB Adobe PDF Visualizza/Apri
Pubblicazioni consigliate

Caricamento pubblicazioni consigliate

Licenza Creative Commons
I metadati presenti in IRIS UNIMORE sono rilasciati con licenza Creative Commons CC0 1.0 Universal, mentre i file delle pubblicazioni sono rilasciati con licenza Attribuzione 4.0 Internazionale (CC BY 4.0), salvo diversa indicazione.
In caso di violazione di copyright, contattare Supporto Iris

Utilizza questo identificativo per citare o creare un link a questo documento: http://hdl.handle.net/11380/1154386
Citazioni
  • ???jsp.display-item.citation.pmc??? ND
  • Scopus 41
  • ???jsp.display-item.citation.isi??? 26
social impact