Attention in Natural Language Processing

Galassi, Andrea; Lippi, Marco; Torroni, Paolo

doi:10.1109/TNNLS.2020.3019893

Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of the textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.
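As a rough illustration of two of the taxonomy's dimensions, the sketch below separates a compatibility function (which scores each input against a query) from a distribution function (which turns the scores into weights). It is a minimal example under stated assumptions, not the article's formulation: the names `attend`, `dot_compatibility`, and `softmax` are illustrative, and the dot product and softmax are only one common choice for each role.

```python
import math
from typing import Callable, List, Sequence

Vector = Sequence[float]


def dot_compatibility(query: Vector, key: Vector) -> float:
    """Score how well one key matches the query (dot product, an assumed choice)."""
    return sum(q * k for q, k in zip(query, key))


def softmax(scores: Sequence[float]) -> List[float]:
    """Map raw scores to positive weights that sum to 1 (an assumed choice)."""
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(
    query: Vector,
    keys: Sequence[Vector],
    values: Sequence[Vector],
    compatibility: Callable[[Vector, Vector], float] = dot_compatibility,
    distribution: Callable[[Sequence[float]], List[float]] = softmax,
) -> List[float]:
    """Return the weighted sum of the value vectors."""
    # Compatibility function: one score per key.
    scores = [compatibility(query, key) for key in keys]
    # Distribution function: scores become attention weights.
    weights = distribution(scores)
    # Context vector: weighted sum of the values, one dimension at a time.
    dim = len(values[0])
    return [sum(w * v[i] for w, v in zip(weights, values)) for i in range(dim)]


# Hypothetical toy input: three 2-dimensional word vectors used as keys and values.
words = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
context = attend(query=[1.0, 0.0], keys=words, values=words)
print(context)
```

Swapping in a different `compatibility` or `distribution` callable changes the attention variant without touching the rest of the function, which is the kind of separation a taxonomy along these dimensions relies on.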

Attention in Natural Language Processing / Galassi, Andrea; Lippi, Marco; Torroni, Paolo. - In: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS. - ISSN 2162-237X. - 32:10(2021), pp. 4291-4308. [10.1109/TNNLS.2020.3019893]



Year of publication: 2021
Journal: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS
Volume: 32
Issue: 10
First page: 4291
Last page: 4308
DOI: https://dx.doi.org/10.1109/TNNLS.2020.3019893
WoS ID: WOS:000704111000006
Scopus ID: 2-s2.0-85117224349
PubMed ID: 32915750
Type: Journal article

Files in this record:

File	Size	Format
TNNLS2020.pdf (open access; type: VOR, version published by the publisher)	2.73 MB	Adobe PDF


The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.