ICLE-RC: International Corpus of Learner English for Relative Clauses

Debopam Das, Izabela Czerniak, Peter Bourgonje


Abstract
We present the ICLE-RC, a corpus of learner English texts annotated for relative clauses and related phenomena. The corpus comprises 144 academic essays from the International Corpus of Learner English (ICLE; Granger et al., 2002), representing six L1 backgrounds: Finnish, Italian, Polish, Swedish, Turkish, and Urdu. The texts are annotated for over 900 relative clauses with respect to a wide array of lexical, syntactic, semantic, and discourse features. The corpus also annotates over 400 instances of related phenomena (it-clefts, pseudo-clefts, existential-relatives, etc.). Here, we describe the corpus annotation framework, report on the inter-annotator agreement (IAA) study, discuss the prospects of (semi-)automating the annotation, and present the first results from our corpus analysis. We envisage the ICLE-RC serving as a valuable resource for research on relative clauses in second language acquisition (SLA), language typology, World Englishes, and discourse analysis.
Anthology ID:
2025.law-1.16
Volume:
Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025)
Month:
July
Year:
2025
Address:
Vienna, Austria
Editors:
Siyao Peng, Ines Rehbein
Venues:
LAW | WS
Publisher:
Association for Computational Linguistics
Pages:
201–215
URL:
https://preview.aclanthology.org/landing_page/2025.law-1.16/
DOI:
10.18653/v1/2025.law-1.16
Cite (ACL):
Debopam Das, Izabela Czerniak, and Peter Bourgonje. 2025. ICLE-RC: International Corpus of Learner English for Relative Clauses. In Proceedings of the 19th Linguistic Annotation Workshop (LAW-XIX-2025), pages 201–215, Vienna, Austria. Association for Computational Linguistics.
Cite (Informal):
ICLE-RC: International Corpus of Learner English for Relative Clauses (Das et al., LAW 2025)
PDF:
https://preview.aclanthology.org/landing_page/2025.law-1.16.pdf