Enhancing LLM-assisted move annotation: A focus on inter-model inconsistencies / Yu, D., Yu, R., Bondi, M., Hyland, K. - In: RESEARCH METHODS IN APPLIED LINGUISTICS. - ISSN 2772-7661. - 5:2(2026), pp. 1-16. [10.1016/j.rmal.2026.100328]
Enhancing LLM-assisted move annotation: A focus on inter-model inconsistencies
Yu, Danni; Bondi, Marina; Hyland, Ken
2026
Abstract
Recent studies have shown the potential of Large Language Models (LLMs) to annotate text corpora for genre moves, but also cautioned against their errors and inconsistencies. To improve the accuracy of LLM-assisted annotation, this study presents a human-AI collaborative approach that involves human verification focusing on resolving inconsistencies revealed in separate rounds of LLM-generated annotations. Specifically, we examined the inconsistencies between GPT-4 and GPT-4o in annotating rhetorical moves in 800 abstracts from four applied linguistics journals. This approach uncovered similarities and differences in error patterns between the two models, indicating their different focuses in linguistic recognition, especially with instances marked by vagueness. This suggests the need to use different LLMs for annotation, as relying on a single model may lead to repeated errors of the same type. The findings show that a human-AI collaborative approach enhances annotation accuracy significantly. Moreover, analysing inconsistencies provides opportunities to calibrate the annotation scheme and refine linguistic theory.

| File | Access | Type | Size | Format |
|---|---|---|---|---|
| YU_YU_BONDI_HYLAND2026_1-s2.0-S2772766126000340-main.pdf | Restricted access | VOR - Version published by the publisher | 1.5 MB | Adobe PDF |