The rapid digitization of cultural heritage has underscored the critical need for robust digital libraries, particularly for underrepresented languages like Arabic and Persian. This paper describes the methodologies and challenges involved in developing a metadata-driven Arabic digital library, utilizing bibliographic metadata extracted from the Diamond catalogue. It explores advanced metadata schemas, such as Dublin Core, and integrates text recognition technologies and preservation strategies to address key concerns of accessibility, scholarly use, and the long-term preservation of Arabic-script texts. The paper delves into specific challenges of processing Arabic script, including handling calligraphy, diacritics, and ligatures, and introduces innovative solutions like the use of frontispiece images to train OCR systems. Furthermore, it discusses how integrated metadata could not only enhance text recognition but also improve user engagement by enabling refined search functionalities and better resource discovery. Finally, the paper outlines future directions for expanding metadata frameworks to ensure interoperability and the long-term preservation of cultural heritage.

Digital Maktaba Project: Proposing a Metadata-Driven Framework for Arabic Library Digitization / EL GANADI, Amina; Gagliardelli, Luca; Aftar, Sania; Ruozzi, Federico. - Vol-3937:(2025). (Paper presented at IRCDL 2025: 21st Conference on Information and Research Sciences Connecting to Digital and Library Science, held in Udine, Italy, February 20-21, 2025).

Digital Maktaba Project: Proposing a Metadata-Driven Framework for Arabic Library Digitization

Amina El Ganadi (Writing – Original Draft Preparation);
Luca Gagliardelli (Writing – Review & Editing);
Sania Aftar (Member of the Collaboration Group);
Federico Ruozzi (Supervision)
2025

Year: 2025
Conference: IRCDL 2025: 21st Conference on Information and Research Sciences Connecting to Digital and Library Science
Location: Udine, Italy
Dates: February 20-21, 2025
Volume: Vol-3937
Authors: EL GANADI, Amina; Gagliardelli, Luca; Aftar, Sania; Ruozzi, Federico
Files in this item:
short13.pdf
Access: Open access
Type: VOR (version published by the publisher)
Size: 2.83 MB
Format: Adobe PDF
Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1374928