Online sources produce a huge amount of textual data, i.e., freeform text. To derive insights from these data and facilitate the application of Machine Learning algorithms, textual data need to be processed and structured. Knowledge Graphs (KGs) are intelligent systems for the analysis of documents. In recent years, they have been adopted in multiple contexts, including text mining, to develop data-driven solutions to different problems. This paper aims to provide a methodology to build KGs from textual data and to apply algorithms that group similar documents into communities. The methodology exploits semantic and statistical approaches to extract relevant insights from each document; these data are then organized in a KG that allows for their interconnection. The methodology has been successfully tested on news articles related to crime events that occurred in the city of Modena, Italy. The promising results demonstrate how KG-based analysis can improve the management of information coming from online sources.
Knowledge Graphs for Community Detection in Textual Data / Rollo, F.; Po, L. - 1686:(2022), pp. 201-215. (Paper presented at the 4th Iberoamerican and the 3rd Indo-American Knowledge Graphs and Semantic Web Conference, KGSWC 2022, held in Spain in 2022) [10.1007/978-3-031-21422-6_15].
Knowledge Graphs for Community Detection in Textual Data
Rollo F.;Po L.
2022
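The grouping step summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the extraction techniques or the community detection algorithm, so the sketch assumes that entities have already been extracted from each document, links documents whose entity sets have a Jaccard similarity above a hypothetical threshold, and uses connected components as a simple stand-in for a community detection algorithm. The document identifiers, entities, and function names are all illustrative.

```python
from itertools import combinations

# Hypothetical toy input: document id -> entities already extracted from it.
docs = {
    "a1": {"theft", "Modena", "bicycle"},
    "a2": {"theft", "Modena", "bicycle", "station"},
    "a3": {"robbery", "bank"},
    "a4": {"robbery", "bank", "Modena"},
}


def jaccard(x, y):
    """Jaccard similarity between two entity sets."""
    return len(x & y) / len(x | y)


def build_similarity_graph(docs, threshold=0.5):
    """Link two documents when their entity sets are similar enough.

    The threshold is an assumed parameter, not a value from the paper.
    """
    graph = {doc_id: set() for doc_id in docs}
    for a, b in combinations(docs, 2):
        if jaccard(docs[a], docs[b]) >= threshold:
            graph[a].add(b)
            graph[b].add(a)
    return graph


def communities(graph):
    """Return connected components, a simple stand-in for community detection."""
    seen = set()
    result = []
    for start in graph:
        if start in seen:
            continue
        component = set()
        stack = [start]
        while stack:
            node = stack.pop()
            if node in component:
                continue
            component.add(node)
            stack.extend(graph[node] - component)
        seen |= component
        result.append(component)
    return result


groups = communities(build_similarity_graph(docs))
print(sorted(sorted(group) for group in groups))
```

In this toy input the two theft articles share most of their entities and end up in one group, while the two robbery articles form another; the shared entity "Modena" alone is not enough to merge them at the assumed threshold.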
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.