Schema Label Normalization for Improving Schema Matching

Sorrentino, Serena; Bergamaschi, Sonia; Gawinecki, Maciej; Po, Laura

doi:10.1016/j.datak.2010.10.004

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and structure. Starting from the “hidden meaning” associated with schema labels (i.e., class/attribute names), it is possible to discover relationships among the elements of different schemata. Lexical annotation (i.e., annotation with respect to a thesaurus or lexical resource) helps associate a “meaning” with schema labels. However, the performance of semi-automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns, abbreviations, and acronyms. We address this problem by proposing a method for schema label normalization that increases the number of comparable labels. The method semi-automatically expands abbreviations/acronyms and annotates compound nouns with minimal manual effort. We show empirically that our normalization method helps identify similarities among schema elements of different data sources, thus improving schema matching results.
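To make the idea of label normalization concrete, the following is a minimal sketch and not the authors' implementation. It assumes a toy word list in place of a real lexical resource and a hand-written abbreviation table; the names DICTIONARY, ABBREVIATIONS, tokenize, and normalize are hypothetical and chosen only for illustration. It splits a schema label into tokens, expands known abbreviations, and reports the tokens that remain non-dictionary words and would still need manual attention.

```python
import re

# Hypothetical toy resources (assumptions for illustration only);
# a real system would consult a thesaurus or lexical resource instead.
DICTIONARY = {"customer", "number", "address", "quantity", "order", "date"}
ABBREVIATIONS = {"cust": "customer", "num": "number",
                 "addr": "address", "qty": "quantity"}


def tokenize(label):
    """Split a label on underscores, hyphens, spaces, and camelCase boundaries."""
    # Insert a space wherever a lowercase letter or digit is followed by an uppercase letter.
    spaced = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", label)
    return [t.lower() for t in re.split(r"[\s_\-]+", spaced) if t]


def normalize(label):
    """Return (expanded tokens, tokens that are still not in the dictionary)."""
    tokens = [ABBREVIATIONS.get(t, t) for t in tokenize(label)]
    unknown = [t for t in tokens if t not in DICTIONARY]
    return tokens, unknown


if __name__ == "__main__":
    for label in ("custAddr", "ORDER_QTY", "shipDate"):
        print(label, "->", normalize(label))
```

After this kind of normalization, labels such as custAddr and CUSTOMER_ADDRESS reduce to the same token sequence, which is the sense in which normalization increases the number of comparable labels.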

Schema Label Normalization for Improving Schema Matching / Sorrentino, S., Bergamaschi, S., Gawinecki, M., Po, L. - In: DATA & KNOWLEDGE ENGINEERING. - ISSN 0169-023X. - PRINT. - 69:12(2010), pp. 1254-1273. [10.1016/j.datak.2010.10.004]

Publication year: 2010
Journal: DATA & KNOWLEDGE ENGINEERING
Volume: 69
Issue: 12
First page: 1254
Last page: 1273
DOI: https://dx.doi.org/10.1016/j.datak.2010.10.004
WoS code: WOS:000285862800003
Scopus code: 2-s2.0-78649721325
Type: Journal article

Files in this record:

File	Size	Format
DKE2010.pdf (restricted access; VOR, publisher's published version; license: [IR] closed)	1.53 MB	Adobe PDF
DKE_2010_POSTPRINT.pdf (open access; AAM, author's version revised and accepted for publication)	625.75 kB	Adobe PDF
