Attention in Natural Language Processing / Galassi, Andrea; Lippi, Marco; Torroni, Paolo. - In: IEEE TRANSACTIONS ON NEURAL NETWORKS AND LEARNING SYSTEMS. - ISSN 2162-237X. - 32:10(2021), pp. 4291-4308. [10.1109/TNNLS.2020.3019893]

Attention in Natural Language Processing

Galassi, Andrea; Lippi, Marco; Torroni, Paolo
2021

Abstract

Attention is an increasingly popular mechanism used in a wide range of neural architectures. The mechanism itself has been realized in a variety of formats. However, because of the fast-paced advances in this domain, a systematic overview of attention is still missing. In this article, we define a unified model for attention architectures in natural language processing, with a focus on those designed to work with vector representations of textual data. We propose a taxonomy of attention models according to four dimensions: the representation of the input, the compatibility function, the distribution function, and the multiplicity of the input and/or output. We present examples of how prior information can be exploited in attention models and discuss ongoing research efforts and open challenges in the area, providing the first extensive categorization of the vast body of literature in this exciting domain.
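
To make the building blocks named in the abstract concrete, below is a minimal NumPy sketch of one attention step: a compatibility function scores each input element against a query, a distribution function (here softmax) turns the scores into weights, and a weighted sum of values yields the context vector. The scaled dot-product compatibility, function names, and shapes are illustrative assumptions, not the paper's notation; the survey covers many alternative compatibility and distribution functions.

import numpy as np

def softmax(scores):
    # Distribution function: turns compatibility scores into a
    # probability distribution over the input elements.
    e = np.exp(scores - scores.max())
    return e / e.sum()

def attention(keys, values, query):
    """One attention step over n input elements.

    keys:   (n, d) key vectors derived from the input representation
    values: (n, d) value vectors to be aggregated
    query:  (d,)   query vector
    Returns the context vector and the attention weights.
    """
    # Compatibility function: scaled dot product, one common choice
    # (additive and multiplicative variants are also surveyed).
    scores = keys @ query / np.sqrt(query.shape[0])
    weights = softmax(scores)
    # Context vector: weighted sum of the values.
    return weights @ values, weights

# Toy usage: 4 input elements with 8-dimensional representations.
rng = np.random.default_rng(0)
K, V = rng.normal(size=(4, 8)), rng.normal(size=(4, 8))
q = rng.normal(size=8)
context, weights = attention(K, V, q)
print(weights.round(3), context.shape)  # weights sum to 1; context is (8,)
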
Files in this record:
TNNLS2020.pdf - Open access - Publisher's published version - 2.73 MB, Adobe PDF

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1215137
Citations
  • PubMed Central: 25
  • Scopus: 243
  • Web of Science: 217