InterDiplo-Covid19 Corpus

Cavalieri, S.; Corrizzato, S.; Facchinetti, R.

The InterDiplo-Covid19 Corpus is a collection of interviews in which diplomats and international operators discuss the spread of Covid19 and its political, social, and economic consequences. The interviews were collected from the best-known international broadcasting companies (e.g. BBC, CNN, CGTN, ARIRANG, SKY NEWS UK, FRANCE 24 ENGLISH) or, where VPN issues prevented access, from their YouTube channels, where complete interviews are often published. The interviews are conducted in English by journalists who do not always share the interviewees' lingua-cultural background, since both interviewers and interviewees can be native or non-native speakers of English. The corpus is in XML format and is tagged for metadata, parts of speech, discursive features, questions and answers. It includes 80 interviews collected over a one-year timespan, from February 2020 to February 2021, and consists of 236,000 tokens. To keep the corpus balanced, the 80 interviews were grouped into 4 sub-corpora: 1) 20 interviews in which the interviewer and the interviewee are both native speakers of English; 2) 20 interviews in which the interviewer is a native speaker of English whereas the interviewee is a non-native speaker of English; 3) 20 interviews in which the interviewer is a non-native speaker of English whereas the interviewee is a native speaker of English; 4) 20 interviews in which the interviewer and the interviewee are both non-native speakers of English.
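The four-way grouping described above can be illustrated with a minimal Python sketch. The record does not document the corpus markup, so the element name `interview` and the attributes `interviewer` and `interviewee` are hypothetical, invented here only to show how speaker status could map onto the four sub-corpora.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical markup: element and attribute names are invented for
# illustration and are NOT taken from the corpus itself.
SAMPLE = """
<corpus>
  <interview id="i01" interviewer="native" interviewee="native"/>
  <interview id="i02" interviewer="native" interviewee="non-native"/>
  <interview id="i03" interviewer="non-native" interviewee="native"/>
  <interview id="i04" interviewer="non-native" interviewee="non-native"/>
  <interview id="i05" interviewer="native" interviewee="non-native"/>
</corpus>
"""

# The four sub-corpora, numbered as in the abstract:
# (interviewer status, interviewee status) -> sub-corpus number.
SUBCORPORA = {
    ("native", "native"): 1,
    ("native", "non-native"): 2,
    ("non-native", "native"): 3,
    ("non-native", "non-native"): 4,
}


def subcorpus_counts(xml_text):
    """Count interviews per sub-corpus from the speaker-status attributes."""
    root = ET.fromstring(xml_text.strip())
    counts = Counter()
    for interview in root.iter("interview"):
        key = (interview.get("interviewer"), interview.get("interviewee"))
        counts[SUBCORPORA[key]] += 1
    return dict(counts)


def is_balanced(counts, per_group):
    """True if all four sub-corpora hold exactly `per_group` interviews."""
    return all(counts.get(n, 0) == per_group for n in (1, 2, 3, 4))


if __name__ == "__main__":
    print(subcorpus_counts(SAMPLE))
```

Applied to the full corpus under these assumptions, the balance check would be called with `per_group=20`, matching the 4 x 20 = 80 interviews stated in the abstract.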

InterDiplo-Covid19 Corpus / Cavalieri, S.; Corrizzato, S.; Facchinetti, R. - (2021).



Year of publication: 2021
All authors: Cavalieri, S.; Corrizzato, S.; Facchinetti, R.
Type: Database

