Research on data integration has provided languages and systems able to guarantee an integrated intensional representation of a given set of data sources. A significant limitation common to most proposals is that only intensional knowledge is considered, with little or no consideration for extensional knowledge. In this paper we propose a technique to enrich the intension of an attribute with a new sort of metadata: the “relevant values”, extracted from the attribute values. Relevant values enrich schemata with domain knowledge; moreover, they can be exploited by a user in the interactive process of creating or refining a query. The technique, fully implemented in a prototype, is automatic and independent of the attribute domain; it is based on data mining clustering techniques and on semantics emerging from data values. It is parametrized with various metrics for similarity measures and is a viable tool for dealing with frequently changing sources.
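The idea of extracting “relevant values” by clustering attribute values under a pluggable similarity metric can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the threshold-based single-link grouping, the choice of the most central member as the representative, and the two similarity measures are all assumptions made for illustration only.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict, Iterable, List

# A similarity metric maps two attribute values to a score in [0, 1].
Similarity = Callable[[str, str], float]


def edit_similarity(a: str, b: str) -> float:
    """Character-level similarity (hypothetical choice of metric)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def jaccard_similarity(a: str, b: str) -> float:
    """Token-set overlap (another hypothetical choice of metric)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    if not ta and not tb:
        return 1.0
    return len(ta & tb) / len(ta | tb)


def cluster_values(values: Iterable[str], similarity: Similarity,
                   threshold: float) -> List[List[str]]:
    """Single-link grouping: a value joins (and merges) every cluster
    that contains at least one member similar enough to it."""
    clusters: List[List[str]] = []
    for v in sorted(set(values)):
        merged, rest = [v], []
        for c in clusters:
            if any(similarity(v, m) >= threshold for m in c):
                merged.extend(c)
            else:
                rest.append(c)
        clusters = rest + [merged]
    return clusters


def relevant_values(values: Iterable[str], similarity: Similarity,
                    threshold: float) -> Dict[str, List[str]]:
    """Map one representative per cluster to the values it stands for.
    The representative is the member most similar to the others."""
    result: Dict[str, List[str]] = {}
    for cluster in cluster_values(values, similarity, threshold):
        members = sorted(cluster)
        rep = max(members,
                  key=lambda c: sum(similarity(c, o) for o in members))
        result[rep] = members
    return result


if __name__ == "__main__":
    # Made-up attribute values, used only to exercise the sketch.
    sample = ["color", "colour", "weight"]
    for rep, members in relevant_values(sample, edit_similarity, 0.8).items():
        print(rep, "<-", members)
```

Because the metric is passed in as a parameter, swapping `edit_similarity` for `jaccard_similarity` (or any other function with the same signature) changes the grouping without touching the clustering code, which mirrors the abstract's point that the technique is parametrized by the similarity measure.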
A new type of metadata for querying data integration systems / Bergamaschi, Sonia; Guerra, Francesco; Orsini, Mirko; Sartori, C. - Print. - (2007), pp. 266-273. (Paper presented at the Convegno Nazionale Sistemi di Basi di Dati Evolute, held in Torre Canne (Fasano, BR), 17-20 June 2007).
A new type of metadata for querying data integration systems
BERGAMASCHI, Sonia; GUERRA, Francesco; ORSINI, Mirko
2007