Provenance Based Conflict Handling Strategies / Beneventano, Domenico. - Print. - 7240:(2012), pp. 286-297. (Paper presented at the 17th International Conference on Database Systems for Advanced Applications, DASFAA 2012, held in Busan, Korea, April 15-19) [10.1007/978-3-642-29023-7_29].
Provenance Based Conflict Handling Strategies
BENEVENTANO, Domenico
2012
Abstract
A fundamental task in data integration is data fusion, the process of fusing multiple records representing the same real-world object into a consistent representation. Data fusion involves the resolution of possible conflicts between data coming from different sources; several high-level strategies to handle inconsistent data have been described and classified in [8]. The MOMIS Data Integration System [2] uses both conflict-avoiding strategies (such as the trust your friends strategy, which takes the value of a preferred source) and resolution strategies (such as the meet in the middle strategy, which takes an average value).
In this paper we consider other strategies proposed in the literature to handle inconsistent data, and we discuss how they can be adopted and extended in the MOMIS Data Integration System. First of all, we consider the methods introduced by the Trio system [1,6], which are based on the idea of tackling data conflicts by explicitly including information on provenance to represent uncertainty and using it to answer queries. Other possible strategies are to ignore conflicting values at the global level (i.e., only consistent values are considered) and to consider all conflicting values at the global level.
The original contribution of this paper is a provenance-based framework which includes all the above-mentioned conflict handling strategies and uses them as different search strategies for querying the integrated sources.
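The four strategies named in the abstract can be illustrated with a minimal sketch. It is not taken from the paper and does not reflect the MOMIS or Trio APIs: the function names, the source labels `S1` to `S3`, and the dictionary-based representation of per-source values are all assumptions made for illustration only.

```python
from statistics import mean

# Hypothetical representation: the values that each source reports for one
# attribute of one real-world object, keyed by an assumed source name.


def trust_your_friends(values, preferred):
    """Conflict avoidance: take the value reported by a preferred source."""
    return values.get(preferred)


def meet_in_the_middle(values):
    """Conflict resolution: take the average of the reported numeric values."""
    return mean(values.values())


def only_consistent(values):
    """Ignore conflicts: return a value only when all sources agree."""
    distinct = set(values.values())
    return distinct.pop() if len(distinct) == 1 else None


def all_with_provenance(values):
    """Keep every conflicting value, recording which sources reported it."""
    by_value = {}
    for source, value in values.items():
        by_value.setdefault(value, []).append(source)
    return by_value


# Three assumed sources disagree on a price.
price = {"S1": 10.0, "S2": 14.0, "S3": 10.0}

preferred_value = trust_your_friends(price, "S2")
average_value = meet_in_the_middle(price)
consistent_value = only_consistent(price)
provenance_view = all_with_provenance(price)
```

In this sketch the last function keeps the link between each candidate value and the sources that support it, which is the kind of provenance information a query could later use to choose among the strategies.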
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.