Dimensionality Reduction of Unstructured and Network Data for Stance Detection / Sciandra, A. - 2:(2022), pp. 801-808. (Paper presented at JADT 2022 - 16th International Conference on Statistical Analysis of Textual Data, held in Napoli (Italy), 06-08 July 2022).

Dimensionality Reduction of Unstructured and Network Data for Stance Detection

Sciandra A.
2022

Abstract

The idea behind this work stems from participation in shared tasks on stance detection at NLP conferences. In these competitions, participants tried to build the best system for predicting stance ('favor', 'against', or 'none') on selected topics, based on the messages of users of a social networking site and the relationships among them. The available data therefore comprised both textual and network data. The teams we collaborated with reduced the dimensionality of the network data through Multidimensional Scaling. The textual data, by contrast, were handled with various feature extraction methods, without particular attention to dimensionality reduction for unstructured data. In this paper we present the empirical results of a two-step strategy for obtaining lower-dimensional textual data, relying on text mining techniques and principal component analysis. The results show accuracy levels comparable to those of classical feature extraction techniques and of the best task models, despite using far fewer predictors.
Year: 2022
Conference: JADT 2022 - 16th International Conference on Statistical Analysis of Textual Data
Location: Napoli (Italy)
Dates: 06-08 July, 2022
Volume: 2
First page: 801
Last page: 808
Author: Sciandra, A.
Files in this item:
File: Sciandra_jadt2022_rev.pdf
Access: Open access
Type: Author's version, revised and accepted for publication
Size: 351.24 kB
Format: Adobe PDF

Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal license, while the publication files are released under the Attribution 4.0 International (CC BY 4.0) license, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1284167