Uncertainty in data integration systems: automatic generation of probabilistic relationships / Bergamaschi, Sonia; Po, Laura; Sorrentino, Serena; Corni, Alberto. - Print. - (2010), pp. 221-228. (Paper presented at the 6th Conference of the Italian Chapter of the Association for Information Systems, ItAIS 2009, held in Costa Smeralda, Italy, 2-3 October 2009) [10.1007/978-3-7908-2404-9_26].
Uncertainty in data integration systems: automatic generation of probabilistic relationships
BERGAMASCHI, Sonia; PO, Laura; SORRENTINO, Serena; CORNI, Alberto
2010
Abstract
This paper proposes a method for the automatic discovery of probabilistic relationships in the environment of data integration systems. Dynamic data integration systems extend the architecture of current data integration systems by modeling uncertainty at their core. Our method is based on probabilistic word sense disambiguation (PWSD), which makes it possible to automatically annotate the schemata of a given set of data sources to be integrated lexically, i.e. with respect to a thesaurus or lexical resource. From the annotated schemata and the relationships defined in the thesaurus, we derive the probabilistic lexical relationships among schema elements. Both lexical and structural relationships are collected in the Probabilistic Common Thesaurus (PCT).
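As a rough illustration of the idea summarized in the abstract, the sketch below derives probabilistic lexical relationships between two schema elements from probabilistic sense annotations and a small thesaurus. It is not the authors' algorithm: the element names, sense identifiers, relation labels (`SYN`, `RT`), and the scoring rule (summing the products of sense probabilities, which assumes the two annotations are independent) are all assumptions made for this example.

```python
from itertools import product

# Hypothetical PWSD output: each schema element is annotated with
# candidate senses and the probability assigned to each sense.
annotations = {
    "Hotel.name": {"name#1": 0.7, "name#2": 0.3},
    "Accommodation.title": {"title#1": 0.6, "name#1": 0.4},
}

# Hypothetical thesaurus: symmetric relations between senses.
# "SYN" (synonym) and "RT" (related term) are illustrative labels.
thesaurus = {
    ("name#1", "title#1"): "SYN",
}


def sense_relation(s1, s2, thesaurus):
    """Return the relation between two senses, or None if unrelated."""
    if s1 == s2:
        return "SYN"  # assumption: identical senses count as synonyms
    return thesaurus.get((s1, s2)) or thesaurus.get((s2, s1))


def derive_relationships(annotations, thesaurus):
    """Map (element_a, element_b, relation) to a probability.

    Assumption: sense assignments of different elements are independent,
    so each related sense pair contributes the product of its two
    probabilities.
    """
    relationships = {}
    elements = sorted(annotations)
    for i, a in enumerate(elements):
        for b in elements[i + 1:]:
            pairs = product(annotations[a].items(), annotations[b].items())
            for (sense_a, p_a), (sense_b, p_b) in pairs:
                kind = sense_relation(sense_a, sense_b, thesaurus)
                if kind is None:
                    continue
                key = (a, b, kind)
                relationships[key] = relationships.get(key, 0.0) + p_a * p_b
    return relationships


result = derive_relationships(annotations, thesaurus)
for (a, b, kind), prob in result.items():
    print(f"{a} {kind} {b}: {prob:.2f}")
```

In this toy setting the two elements end up related by a single `SYN` relationship whose probability combines two sense pairs: `title#1` with `name#1` (via the thesaurus entry) and the shared sense `name#1`. A collection of such triples could stand in for the lexical part of a structure like the PCT, again only under the assumptions stated above.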