Categorising speakers’ language background: Theoretical assumptions and methodological challenges for learner corpus research / Lopopolo, Olga; Bienati, Arianna; Frey, Jennifer-Carmen; Glaznieks, Aivars; Spina, Stefania. - In: RESEARCH METHODS IN APPLIED LINGUISTICS. - ISSN 2772-7661. - 4:1(2025), pp. 100170-100170. [10.1016/j.rmal.2024.100170]

Categorising speakers’ language background: Theoretical assumptions and methodological challenges for learner corpus research

Lopopolo, Olga; Bienati, Arianna; Frey, Jennifer-Carmen; Glaznieks, Aivars; Spina, Stefania
2025

Abstract

In this article, we investigate how speakers can be categorised based on their language background in the field of Learner Corpus Research (LCR). Specifically, we discuss three key aspects: first, the theoretical assumptions and methodological choices made in learner corpus design; second, the integration of a holistic perspective for speaker categorisation in LCR; and third, the consequences that different categorisations might have on study outcomes. Through a comprehensive review of corpora used in the field, we identify the most common terms, definitions and categorisation criteria used to describe a speaker’s language background. Focusing on the most central metadata encoding language backgrounds, the L1 metadata, we inspect the different operationalisations in use and scrutinise the theoretical assumptions underlying them. Drawing on research on plurilingualism, we propose a holistic view of speakers’ language backgrounds for Learner Corpus Research, combining various aspects of speakers’ language use through methods inspired by the Dominant Language Constellation framework. We apply this methodology to re-evaluate the language categorisation system in LEONIDE, a multilingual corpus of Italian, German and English texts from secondary school students of diverse language backgrounds. We use the same corpus to evaluate how different categorisations of the students affect the outcomes of possible linguistic studies. Despite a generally high overlap between study results across categorisations, we observe that variables combining multiple aspects of the speakers’ language backgrounds seem to explain group differences for more of the linguistic features investigated.
Files in this item:
Lopopolo et al. - 2025 - Categorising speakers’ language background Theore.pdf

Open access

Type: VOR - Version published by the publisher
License: [IR] creative-commons
Size: 1.43 MB
Format: Adobe PDF
Creative Commons License
The metadata in IRIS UNIMORE are released under the Creative Commons CC0 1.0 Universal licence, while publication files are released under the Attribution 4.0 International (CC BY 4.0) licence, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/11380/1382171