The problem of identifying the multiple copies generated for the same object is known as Object Identification (OI). It is a data quality problem: once the corrupted copies are identified, the quality of the object (data) can be restored. In the literature, solutions are mainly oriented towards discovering pairs of duplicates (pairs-oriented OI) rather than sets of similar objects (group-oriented OI). We propose a new technique to solve the OI problem among many sources in a quasi-decentralized manner. The technique is based on the concept of constraints and is composed of two phases: extraction and grouping. First, we extract constraints by analyzing the data at hand (the decentralized phase). Then, we reason about those constraints to find the groups of similar objects (the centralized phase). We have conducted several tests that show the effectiveness of our proposal.
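The two-phase structure described above can be illustrated with a minimal sketch. It is not the authors' algorithm: the abstract does not specify what the constraints look like, so the sketch assumes the simplest possible form (a "these two records denote the same object" pair), a string-similarity test with an arbitrary threshold for the extraction phase, and a union-find merge for the grouping phase. All names (`extract_constraints`, `group_objects`, the record identifiers) are hypothetical.

```python
from difflib import SequenceMatcher
from itertools import combinations


def extract_constraints(records, threshold=0.8):
    """Decentralized phase: analyze one batch of records in isolation.

    `records` maps a record id to a string value. A pair whose similarity
    reaches `threshold` yields a "same object" constraint. Both the
    similarity measure and the threshold are illustrative assumptions.
    """
    constraints = []
    for (id_a, val_a), (id_b, val_b) in combinations(sorted(records.items()), 2):
        ratio = SequenceMatcher(None, val_a.lower(), val_b.lower()).ratio()
        if ratio >= threshold:
            constraints.append((id_a, id_b))
    return constraints


def group_objects(all_ids, constraints):
    """Centralized phase: reason over all constraints to build groups.

    Records connected by a chain of constraints land in the same group
    (union-find with path compression).
    """
    parent = {record_id: record_id for record_id in all_ids}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in constraints:
        parent[find(a)] = find(b)

    groups = {}
    for record_id in all_ids:
        groups.setdefault(find(record_id), set()).add(record_id)
    return sorted(sorted(group) for group in groups.values())


# Two batches analyzed independently; ids are prefixed with a made-up source name.
batch_1 = {"s1:1": "John Smith", "s2:1": "Jon Smith", "s1:2": "Mary Jones"}
batch_2 = {"s2:1": "Jon Smith", "s3:1": "John Smyth", "s3:2": "Peter Brown"}

constraints = extract_constraints(batch_1) + extract_constraints(batch_2)
groups = group_objects(set(batch_1) | set(batch_2), constraints)
print(groups)
```

With these toy batches, the three spellings of the Smith record end up in one group although no single batch contains all three, which is the group-oriented result that a purely pairs-oriented comparison would not produce directly. The other two records remain singletons.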
Object Identification across Multiple Sources / Beneventano, Domenico; Di Gioia, Matteo; Scannapieco, Monica. - Print. - (2010), pp. 414-425. (Paper presented at the Eighteenth Italian Symposium on Advanced Database Systems, SEBD 2010, held in Rimini, Italy, June 20-23, 2010).