An Intrinsically Interpretable Entity Matching System

Del Buono, F.; Guerra, F.; Vincini, M.
2023

Abstract

Explainable classification systems generate predictions along with a weight for each term in the input record that measures its contribution to the prediction. In the entity matching (EM) scenario, inputs are pairs of entity descriptions, and the resulting explanations can be difficult for users to understand: they can be very long and may assign different impacts to similar terms located in different descriptions. To address these issues, we introduce the concept of decision units, i.e., basic information units formed either by pairs of (similar) terms, each belonging to a different entity description, or by unique terms, existing in only one of the descriptions. Decision units form a new feature space that represents pairs of entity descriptions in a compact and meaningful way. An explainable model trained on such features generates effective explanations customized for EM datasets. In this paper, we realize this idea via a three-component architecture template consisting of a decision unit generator, a decision unit scorer, and an explainable matcher. We then introduce WYM (Why do You Match?), an implementation of the architecture oriented to textual EM databases. Experiments show that our approach achieves accuracy comparable to state-of-the-art Deep Learning based EM models but, unlike them, produces highly interpretable predictions.
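The abstract describes decision units only in prose. The following is a minimal, hypothetical sketch of how such units could be assembled for a pair of entity descriptions: similar tokens from the two descriptions are paired, and tokens left unmatched become unique units. The tokenization, the character-level similarity from difflib, and the 0.8 threshold are placeholder assumptions for illustration; they are not the decision unit generator actually used by WYM.

```python
# Illustrative sketch only (not the WYM implementation): build "decision units"
# from two entity descriptions by pairing similar tokens across descriptions and
# keeping the remaining tokens as unique units.
from difflib import SequenceMatcher


def decision_units(desc_a: str, desc_b: str, threshold: float = 0.8):
    tokens_a = desc_a.lower().split()
    tokens_b = desc_b.lower().split()
    units, used_b = [], set()

    for ta in tokens_a:
        # Greedily match each token of description A with its most similar,
        # still-unmatched token of description B.
        best_j, best_sim = None, 0.0
        for j, tb in enumerate(tokens_b):
            if j in used_b:
                continue
            sim = SequenceMatcher(None, ta, tb).ratio()  # placeholder similarity
            if sim > best_sim:
                best_j, best_sim = j, sim
        if best_j is not None and best_sim >= threshold:
            used_b.add(best_j)
            units.append((ta, tokens_b[best_j]))  # paired decision unit
        else:
            units.append((ta, None))              # unique to description A

    # Tokens of description B that were never paired become unique units too.
    units += [(None, tb) for j, tb in enumerate(tokens_b) if j not in used_b]
    return units


print(decision_units("apple iphone 12 64gb black",
                     "iphone 12 apple 64 gb midnight"))
```

A model trained on scores attached to these units, rather than on the raw token sequences, can then expose which paired or unique terms drove a match/non-match decision.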
Year: 2023
Conference: 26th International Conference on Extending Database Technology, EDBT 2023
Location: Ioannina, Greece
Dates: March 28 - March 31
Volume: 26
Pages: 645-657
Baraldi, A.; Del Buono, F.; Guerra, F.; Paganelli, M.; Vincini, M.
An Intrinsically Interpretable Entity Matching System / Baraldi, A.; Del Buono, F.; Guerra, F.; Paganelli, M.; Vincini, M. - 26:3(2023), pp. 645-657. (Paper presented at the 26th International Conference on Extending Database Technology, EDBT 2023, held in Ioannina, Greece, March 28 - March 31) [10.48786/edbt.2023.54].

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1311826
Citations: Scopus 1