Searching Similar (Sub)Sentences for Example-Based Machine Translation

Mandreoli, Federica; Martoglia, Riccardo; Tiberio, Paolo

Translation is a repetitive activity. Automating such a difficult task has been a long-standing scientific goal; in recent years research in this field has attracted growing interest, making some forms of Machine Translation (MT) a reality. Among the several approaches to MT, one of the most promising paradigms is Machine-Aided Human Translation (MAHT) and, in particular, Example-Based Machine Translation (EBMT). An EBMT system translates by analogy, using past translations to translate other, similar source-language sentences into the target language. The basic premise is that, if a previously translated sentence occurs again, the same translation is likely to be correct. In this paper, we propose a solution based on a purely syntactic approach for searching similar sentences, and parts of them, in an EBMT system; the underlying similarity measure is based on the similarity between sequences of terms, so that the sentences closest to a given one are those that preserve most of its original form and content. The system efficiently retrieves and ranks the most similar sentences available and, when no useful suggestion exists, it proceeds to retrieve similar parts. We opted for a design that requires minimal changes to existing databases and whose similarity measure and search algorithms are completely independent of the languages involved. This work was developed jointly with LOGOS S.p.A., a worldwide leader in multilingual document translation.
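
The abstract does not spell out the similarity measure, so the following is only a minimal sketch of the general idea of ranking stored sentences by similarity between term sequences. It assumes a word-level edit distance normalized by the longer sentence length as a stand-in measure; the function names (tokenize, edit_distance, similarity, rank_similar), the whitespace tokenizer, and the 0.5 threshold are all hypothetical choices for illustration and are not taken from the paper.

```python
def tokenize(sentence):
    # Assumption: lowercase whitespace splitting; a real system would use
    # a proper tokenizer.
    return sentence.lower().split()


def edit_distance(a, b):
    # Classic dynamic-programming edit distance over two token sequences,
    # keeping only the previous row to save memory.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # delete x
                            curr[j - 1] + 1,      # insert y
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def similarity(s, t):
    # Normalize the distance to [0, 1]; 1.0 means identical token sequences.
    a, b = tokenize(s), tokenize(t)
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - edit_distance(a, b) / longest


def rank_similar(query, memory, threshold=0.5):
    # Score every stored sentence, drop those below the (hypothetical)
    # threshold, and return the rest from most to least similar.
    scored = [(similarity(query, s), s) for s in memory]
    kept = [pair for pair in scored if pair[0] >= threshold]
    return sorted(kept, key=lambda pair: -pair[0])


if __name__ == "__main__":
    memory = [
        "press the green button",
        "open the file menu",
        "press the red button now",
    ]
    for score, sentence in rank_similar("press the red button", memory):
        print(f"{score:.2f}  {sentence}")
```

Because the measure works on token sequences alone, nothing in this sketch depends on the language of the sentences. The exhaustive scan over the stored sentences is for illustration only and says nothing about how the system described in the abstract retrieves candidates efficiently.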

Searching Similar (Sub)Sentences for Example-Based Machine Translation / Mandreoli, F., Martoglia, R., Tiberio, P. - Print. - (2002), pp. 208-221. (Decimo Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD 2002), Portoferraio, Italy, June 2002).

Year of publication: 2002
Conference title: Decimo Convegno Nazionale su Sistemi Evoluti per Basi di Dati (SEBD 2002)
Conference location: Portoferraio, Italy
Conference date: June 2002
First page: 208
Last page: 221
Type: Paper in conference proceedings
