SCoRE: Streamlined corpus-based relation extraction using multi-label contrastive learning and Bayesian kNN

Mariotti, L.; Guidetti, V.; Mandreoli, F.

doi:10.1016/j.knosys.2025.115024

The growing demand for efficient knowledge graph (KG) enrichment leveraging external corpora has intensified interest in relation extraction (RE), particularly under low-supervision settings. To address the need for adaptable and noise-resilient RE solutions that integrate seamlessly with pre-trained large language models (PLMs), we introduce SCoRE, a modular and cost-effective sentence-level RE system. SCoRE enables easy PLM switching, requires no fine-tuning, and adapts smoothly to diverse corpora and KGs. By combining supervised contrastive learning with a Bayesian k-Nearest Neighbors (kNN) classifier for multi-label classification, it delivers robust performance despite the noisy annotations of distantly supervised corpora. To improve RE evaluation, we propose two novel metrics: Correlation Structure Distance (CSD), measuring the alignment between learned relational patterns and KG structures, and Precision at R (P@R), assessing utility as a recommender system. We also release Wiki20d, a benchmark dataset replicating real-world RE conditions where only KG-derived annotations are available. Experiments on five benchmarks demonstrate that SCoRE matches or slightly surpasses state-of-the-art methods (average gains of +3.2 in micro-F1 and +5.9 in macro-F1 against fully reproducible baselines), while reducing the training burden by more than an order of magnitude (approximately 99% lower energy consumption in kWh). Further analyses reveal that increasing model complexity, as seen in prior work, degrades performance, highlighting the advantages of SCoRE's minimal design. Combining efficiency, modularity, and scalability, SCoRE stands as an optimal choice for real-world RE applications.
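The abstract reports gains in both micro-F1 and macro-F1 for multi-label relation extraction. As a minimal sketch of how these two averages differ, the snippet below computes both from per-sentence label sets. The relation names, the toy data, and the helper names `f1` and `micro_macro_f1` are hypothetical illustrations, not taken from the paper, and the convention of scoring a label with no correct predictions as 0 is an assumption.

```python
from typing import List, Set, Tuple


def f1(tp: int, fp: int, fn: int) -> float:
    """F1 from raw counts; defined as 0.0 when there is nothing to score."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def micro_macro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> Tuple[float, float]:
    """Return (micro-F1, macro-F1) for multi-label predictions.

    gold[i] and pred[i] are the sets of relation labels for sentence i.
    """
    labels = sorted(set().union(*gold, *pred))
    # Per-label counts: [true positives, false positives, false negatives]
    counts = {lab: [0, 0, 0] for lab in labels}
    for g, p in zip(gold, pred):
        for lab in labels:
            if lab in p and lab in g:
                counts[lab][0] += 1
            elif lab in p:
                counts[lab][1] += 1
            elif lab in g:
                counts[lab][2] += 1
    # Micro: pool the counts over all labels, then compute one F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1(tp, fp, fn)
    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1(*c) for c in counts.values()) / len(labels) if labels else 0.0
    return micro, macro


# Toy data with hypothetical relation labels (not from the paper).
gold = [{"born_in"}, {"born_in", "works_for"}, {"capital_of"}, {"born_in"}]
pred = [{"born_in"}, {"born_in"}, {"works_for"}, {"born_in"}]

micro, macro = micro_macro_f1(gold, pred)
print(f"micro-F1 = {micro:.3f}, macro-F1 = {macro:.3f}")
```

In this toy case the frequent label is predicted perfectly while the two rare labels are missed, so the micro average stays well above the macro average. This is why the two figures are usually reported together on label-imbalanced corpora.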

SCoRE: Streamlined corpus-based relation extraction using multi-label contrastive learning and Bayesian kNN / Mariotti, L., Guidetti, V., Mandreoli, F. - In: KNOWLEDGE-BASED SYSTEMS. - ISSN 0950-7051. - 333:(2026), pp. 1-13. [10.1016/j.knosys.2025.115024]



Publication year: 2026
Journal: KNOWLEDGE-BASED SYSTEMS
Volume: 333
First page: 1
Last page: 13
DOI: https://dx.doi.org/10.1016/j.knosys.2025.115024
WoS ID: WOS:001639053200002
Scopus ID: 2-s2.0-105024212690
Document type: Journal article

Files in this record:

File	Size	Format
1-s2.0-S0950705125020623-main.pdf (open access; type: VOR, version published by the publisher; license: Creative Commons)	1.92 MB	Adobe PDF


Metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.