Automatic generation of probabilistic relationships for improving schema matching

Po, Laura; Sorrentino, Serena

doi:10.1016/j.is.2010.09.004

Schema matching is the problem of finding relationships among concepts across data sources that are heterogeneous in format and in structure. Starting from the "hidden meaning" associated with schema labels (i.e., class/attribute names), it is possible to discover lexical relationships among the elements of different schemata. In this work, we propose an automatic method for discovering probabilistic lexical relationships in the context of "on the fly" data integration. Our method is based on a probabilistic lexical annotation technique, which automatically associates one or more meanings with schema elements with respect to a thesaurus/lexical resource. However, the accuracy of automatic lexical annotation methods on real-world schemata suffers from the abundance of non-dictionary words such as compound nouns and abbreviations. We address this problem by including a schema label normalization method, which increases the number of comparable labels. From the annotated schemata, we derive the probabilistic lexical relationships to be collected in the Probabilistic Common Thesaurus. The method is applied within the MOMIS data integration system but can easily be generalized to other data integration systems.
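
The pipeline summarized in the abstract (label normalization, probabilistic annotation, derivation of probabilistic relationships) can be illustrated with a minimal sketch. This is not the authors' implementation: the toy dictionaries `ABBREVIATIONS`, `SENSES` and `HYPERNYMS`, the function names, the relationship labels `SYN`/`BT`/`NT`, the head-noun shortcut for compound labels, and the rule of multiplying annotation probabilities under an independence assumption are all hypothetical choices made for illustration.

```python
import re

# Hypothetical toy resources standing in for a real thesaurus/lexical resource.
ABBREVIATIONS = {"cust": "customer", "qty": "quantity"}
# Candidate meanings per word with illustrative annotation probabilities.
SENSES = {
    "customer": {"customer#1": 0.9, "customer#2": 0.1},
    "client": {"customer#1": 0.7, "client#2": 0.3},
    "person": {"person#1": 1.0},
}
# Meaning -> broader meaning (illustrative hypernym links).
HYPERNYMS = {"customer#1": "person#1"}


def normalize_label(label):
    """Split a label on case changes and separators, then expand abbreviations."""
    tokens = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+", label)
    return [ABBREVIATIONS.get(t.lower(), t.lower()) for t in tokens]


def annotate(label):
    """Return {meaning: probability} for the label.

    Simplifying assumption: a compound label is annotated through its
    last word (treated as the head noun).
    """
    words = normalize_label(label)
    head = words[-1] if words else ""
    return SENSES.get(head, {})


def derive_relationships(label_a, label_b):
    """Return {relationship: probability} between two schema labels.

    Assumption: the two annotations are independent, so each pair of
    meanings contributes the product of its annotation probabilities.
    """
    result = {}
    for meaning_a, prob_a in annotate(label_a).items():
        for meaning_b, prob_b in annotate(label_b).items():
            if meaning_a == meaning_b:
                rel = "SYN"  # same meaning
            elif HYPERNYMS.get(meaning_a) == meaning_b:
                rel = "NT"   # label_a is narrower than label_b
            elif HYPERNYMS.get(meaning_b) == meaning_a:
                rel = "BT"   # label_a is broader than label_b
            else:
                continue
            result[rel] = result.get(rel, 0.0) + prob_a * prob_b
    return result


if __name__ == "__main__":
    print(derive_relationships("Cust", "client"))
    print(derive_relationships("CUST", "Person"))
```

In this sketch the abbreviation `Cust` becomes comparable with `client` only after normalization expands it to `customer`, which mirrors the abstract's point that normalization increases the number of comparable labels.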

Automatic generation of probabilistic relationships for improving schema matching / Po, L., Sorrentino, S. - In: INFORMATION SYSTEMS. - ISSN 0306-4379. - Print. - 36:2 (2011), pp. 192-208. [10.1016/j.is.2010.09.004]

Publication year: 2011
Journal: INFORMATION SYSTEMS
Volume: 36
Issue: 2
First page: 192
Last page: 208
DOI: https://dx.doi.org/10.1016/j.is.2010.09.004
WoS ID: WOS:000285366700006
Scopus ID: 2-s2.0-78649489023
All authors: Po, Laura; Sorrentino, Serena
Type: Journal article

Files in this record:

File	Size	Format
Elsevier-Automatic generation of probabilistic relationships for improving schema matching.pdf (restricted access; type: VOR, version published by the publisher)	1.01 MB	Adobe PDF

The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.
In case of copyright infringement, contact Iris Support.