MOMIS: Getting through the THALIA benchmark

Beneventano, Domenico; Bergamaschi, Sonia; Orsini, Mirko; Vincini, Maurizio

During the last decade, many data integration systems characterized by a classical wrapper/mediator architecture based on a Global Virtual Schema (Global Virtual View - GVV) have been proposed. The data sources store the data, while the GVV provides a reconciled, integrated, and virtual view of the underlying sources. Each proposed system contributes to the advancement of the state of the art by focusing on different aspects, providing an answer to one or more challenges of the data integration problem, ranging from system-level heterogeneities, through structural and syntactic heterogeneities, to heterogeneities at the semantic level. The approaches are still partly manual, requiring a great amount of customization for data reconciliation and for writing specific, non-reusable programming code. The specialization of mediator systems makes a comparison among the various systems difficult. Therefore, the last Lowell Report [1] provided guidelines for the definition of a public benchmark for data integration problems. The proposal is called THALIA (Test Harness for the Assessment of Legacy information Integration Approaches) [2], and it provides researchers with a collection of downloadable data sources representing university course catalogues, a set of twelve benchmark queries, and a scoring function for ranking the performance of an integration system. In this paper we show how the MOMIS mediator system we developed [3,4] can deal with all twelve queries of the THALIA benchmark simply by extending and combining the declarative translation functions available in MOMIS, without any overhead of new code. This is a remarkable result: as far as we know, no other system has provided a complete answer to the benchmark.

MOMIS: Getting through the THALIA benchmark / Beneventano, Domenico; Bergamaschi, Sonia; Orsini, Mirko; Vincini, Maurizio. - STAMPA. - (2010), pp. 354-357. (18th Italian Symposium on Advanced Database Systems (SEBD 2010), Rimini, June 20-23, 2010).