The production of machine-readable data in the form of RDF datasets belonging to the Linked Open Data (LOD) Cloud is growing rapidly. However, selecting relevant knowledge sources from the Cloud, assessing their quality, and extracting summary information from a LOD source are all tasks that require considerable human effort. This paper proposes an approach for automatically extracting the most representative information from a LOD source and creating a set of indexes that enhance the description of the dataset. These indexes collect statistical information about the size and complexity of the dataset (e.g., the number of instances), and also depict all the instantiated classes and the properties among them, giving the user a concise overview of the LOD source. The technique is fully implemented in LODeX, a tool able to deal with the performance issues of systems that expose SPARQL endpoints and to cope with the heterogeneity in the knowledge representation of RDF data. LODeX has been evaluated on a large number of endpoints (244) belonging to the LOD Cloud, and the effectiveness of the index extraction process is presented.
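
As a rough illustration of the kind of indexes the abstract describes (instance counts, instantiated classes, and the properties among them), the following minimal sketch computes them over a small in-memory list of triples. It is not the LODeX implementation: the function name `build_indexes`, the `ex:` identifiers, and the sample triples are all hypothetical, and a real extraction would query a SPARQL endpoint rather than a Python list.

```python
from collections import Counter

RDF_TYPE = "rdf:type"  # assumed shorthand for the rdf:type predicate


def build_indexes(triples):
    """Compute simple summary indexes from (subject, predicate, object) triples."""
    # Map each typed instance to the set of classes it belongs to.
    classes_of = {}
    for s, p, o in triples:
        if p == RDF_TYPE:
            classes_of.setdefault(s, set()).add(o)

    # Number of instances for each instantiated class.
    instances_per_class = Counter(
        c for classes in classes_of.values() for c in classes
    )

    # How often each property links an instance of one class to another.
    properties_between = Counter()
    for s, p, o in triples:
        if p == RDF_TYPE:
            continue
        for subject_class in classes_of.get(s, ()):
            for object_class in classes_of.get(o, ()):
                properties_between[(subject_class, p, object_class)] += 1

    return {
        "instance_count": len(classes_of),
        "instances_per_class": dict(instances_per_class),
        "properties_between_classes": dict(properties_between),
    }


# Hypothetical sample data, invented for illustration only.
triples = [
    ("ex:alice", RDF_TYPE, "ex:Person"),
    ("ex:bob", RDF_TYPE, "ex:Person"),
    ("ex:paper1", RDF_TYPE, "ex:Paper"),
    ("ex:alice", "ex:wrote", "ex:paper1"),
    ("ex:bob", "ex:wrote", "ex:paper1"),
    ("ex:alice", "ex:name", "Alice"),
]
indexes = build_indexes(triples)
```

In this sketch, literal-valued properties such as `ex:name` are ignored in the class-to-class index because their objects are not typed instances; whether and how the actual tool summarizes such attributes is not stated in the abstract.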

Online Index Extraction from Linked Open Data Sources / Benedetti, Fabio; Bergamaschi, Sonia; Po, Laura. - ELECTRONIC. - 1267:(2014), pp. 9-20. (Paper presented at the Second International Workshop on Linked Data for Information Extraction (LD4IE 2014), held in Riva del Garda, Italy, on October 20, 2014).

Online Index Extraction from Linked Open Data Sources

Benedetti, Fabio; Bergamaschi, Sonia; Po, Laura
2014

Year: 2014
Conference: Second International Workshop on Linked Data for Information Extraction (LD4IE 2014)
Location: Riva del Garda, Italy
Date: October 20, 2014
Volume: 1267
First page: 9
Last page: 20
Authors: Benedetti, Fabio; Bergamaschi, Sonia; Po, Laura
Files in this item:
File: ceur-ws-org-Vol-1267-LD4IE2014_Benedetti.pdf
Access: Open access
Type: Publisher's published version
Size: 328.78 kB
Format: Adobe PDF

Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1048518