Data–driven Semiotics and Semiotics–driven Machine Learning

Sanna, Leonardo

doi:10.4399/97888255354266

There is a huge and growing variety of digital data. Despite their obvious relevance for the humanities and the social sciences, these massive quantities of data, usually called “big data”, are mainly selected and analyzed using computer science and statistics. This paper proposes a theoretical and practical approach to the analysis of large quantities of data within the field of semiotic analysis. The main claim is that semiotics should engage in dialogue with IT and statistics, which are essential for dealing with the vastness and continuous variability of data. In particular, machine learning could be very useful from a semiotic perspective. In this work, we use a machine learning technique from Natural Language Processing (NLP) to create a vector space based on the co–occurrence probabilities of words. From a distributional semantics perspective, this space is interpreted as a representation of semantic relations among words. We then present two directions in which the joint effort of semiotics and machine learning can be understood. In the first, we propose a case study of semiotics–driven machine learning, in which we create a dataset starting from a semiotic analysis. In the second, we present an example of data–driven semiotics, where semiotic tools are applied to an existing dataset that was not built for semiotic purposes. The two directions should not be understood as a dichotomy but as parts of a joint effort in which semiotics interacts with machine learning and machine learning interacts with qualitative analysis.
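The abstract does not name the specific NLP technique used, so the following is only a minimal sketch of the general idea of a vector space built from co-occurrence probabilities, not the paper's method. The function names (`cooccurrence_vectors`, `cosine`), the window size, and the toy corpus are all assumptions introduced for illustration. Each word is represented by the probability of observing each context word near it, and cosine similarity between two such vectors serves as a rough proxy for semantic relatedness in the distributional sense.

```python
import math
from collections import Counter, defaultdict


def cooccurrence_vectors(sentences, window=2):
    """Map each word to a sparse vector of probabilities P(context | word).

    `sentences` is a list of token lists; `window` is the number of
    neighbouring tokens considered on each side (an assumed default).
    """
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, word in enumerate(tokens):
            start = max(0, i - window)
            end = min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    counts[word][tokens[j]] += 1
    vectors = {}
    for word, contexts in counts.items():
        total = sum(contexts.values())
        # Normalise raw counts so each vector sums to 1.
        vectors[word] = {c: n / total for c, n in contexts.items()}
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(u[k] * v[k] for k in u.keys() & v.keys())
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Toy corpus, purely illustrative.
corpus = [
    "the cat drinks milk".split(),
    "the dog drinks water".split(),
    "the cat chases the dog".split(),
]
vectors = cooccurrence_vectors(corpus)
print(cosine(vectors["cat"], vectors["dog"]))
print(cosine(vectors["cat"], vectors["milk"]))
```

In practice, spaces of this kind are usually learned with dedicated embedding models trained on much larger corpora; the count-based version above is chosen only because it makes the link between co-occurrence probabilities and vector similarity easy to inspect.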

Data–driven Semiotics and Semiotics–driven Machine Learning / Sanna, L. - In: LEXIA. - ISSN 1720-5298. - 2020:33-34(2020), pp. 89-107. [10.4399/97888255354266]



Publication year: 2020
Date of first publication: June 2020
Journal: LEXIA
Volume: 2020
Issue: 33-34
First page: 89
Last page: 107
DOI: https://dx.doi.org/10.4399/97888255354266
Scopus ID: 2-s2.0-85180064079
All authors: Sanna, Leonardo
Type: Journal article

Files in this record:

File	Type	Size	Format	Access
Data–driven Semiotics and Semiotics–driven Machine Learning.pdf	VOR (version published by the publisher)	2.93 MB	Adobe PDF	Restricted


The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International license (CC BY 4.0), unless otherwise indicated.
In case of copyright infringement, contact Iris Support.