Managing and sharing cultural heritage, including in supranational and multi-literate contexts, is an active research topic. In this paper we discuss the research we are conducting in the DigitalMaktaba project and present the first steps towards designing an innovative workflow and tool for the automatic extraction of knowledge from documents written in multiple non-Latin languages (Arabic, Persian and Azerbaijani). The tool leverages different OCR and text processing techniques, together with linguistic corpora, to provide both highly accurate extracted text and rich metadata, overcoming typical limitations of current state-of-the-art systems. In the near future this will enable the development of an automatic cataloguer, which we hope will ultimately help to better preserve and conserve culture in such a demanding scenario.

Preserving and conserving culture: First steps towards a knowledge extractor and cataloguer for multilingual and multi-alphabetic heritages / Bergamaschi, S.; Martoglia, R.; Ruozzi, F.; Vigliermo, R. A.; De Nardis, S.; Sala, L.; Vanzini, M. - (2021), pp. 301-304. (Paper presented at the 1st Conference on Information Technology for Social Good, GoodIT 2021, held in Italy in 2021) [10.1145/3462203.3475927].

Preserving and conserving culture: First steps towards a knowledge extractor and cataloguer for multilingual and multi-alphabetic heritages

Bergamaschi S.; Martoglia R.; Ruozzi F.; Vigliermo R. A.; Sala L.; Vanzini M.
2021

Year: 2021
Conference: 1st Conference on Information Technology for Social Good, GoodIT 2021
Conference location: Italy
Conference year: 2021
Pages: 301-304
Authors: Bergamaschi, S.; Martoglia, R.; Ruozzi, F.; Vigliermo, R. A.; De Nardis, S.; Sala, L.; Vanzini, M.


Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1255162