Abstract
This research investigates the collocational errors made by English learners in a learner corpus, focusing on the extraction of unexpected collocations. A system was proposed and implemented using open-source tooling. First, the collocation extraction module was evaluated against a corpus with manually annotated collocations. Second, a standard collocation list was collected from a corpus of native speakers. Third, a list of unexpected collocations was generated by extracting candidates from a learner corpus and discarding those that appear on the standard list. The overall performance was evaluated, and possible sources of error were identified for future improvement.
- Anthology ID:
- 2020.mwe-1.13
- Volume:
- Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons
- Month:
- December
- Year:
- 2020
- Address:
- online
- Editors:
- Stella Markantonatou, John McCrae, Jelena Mitrović, Carole Tiberius, Carlos Ramisch, Ashwini Vaidya, Petya Osenova, Agata Savary
- Venue:
- MWE
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 101–106
- URL:
- https://aclanthology.org/2020.mwe-1.13
- Cite (ACL):
- Jen-Yu Li and Thomas Gaillat. 2020. Automatic detection of unexpected/erroneous collocations in learner corpus. In Proceedings of the Joint Workshop on Multiword Expressions and Electronic Lexicons, pages 101–106, online. Association for Computational Linguistics.
- Cite (Informal):
- Automatic detection of unexpected/erroneous collocations in learner corpus (Li & Gaillat, MWE 2020)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2020.mwe-1.13.pdf
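
The filtering step described in the abstract (extract candidates from a learner corpus, then discard those found on a native-speaker list) can be illustrated with a minimal sketch. This is not the authors' system: the abstract does not specify the extraction method, so adjacent word pairs are used here purely as a stand-in for collocation candidates, and the function names and example sentences are hypothetical.

```python
from collections import Counter


def extract_candidates(sentences):
    """Count adjacent word pairs as stand-in collocation candidates.

    Assumption: the real extraction module is not described in the
    abstract; plain bigrams are used only for illustration.
    """
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.lower().split()
        counts.update(zip(tokens, tokens[1:]))
    return counts


def unexpected_collocations(learner_sentences, native_sentences):
    """Keep learner candidates that are absent from the native reference list."""
    reference = set(extract_candidates(native_sentences))
    learner = extract_candidates(learner_sentences)
    return {pair: n for pair, n in learner.items() if pair not in reference}


# Hypothetical toy corpora, invented for this example only.
native = ["she made a decision", "he took a photo"]
learner = ["she did a decision", "he took a photo"]

result = unexpected_collocations(learner, native)
for pair, count in sorted(result.items()):
    print(" ".join(pair), count)
```

On these toy inputs, only the pairs introduced by the learner's "did a decision" survive the filter. A realistic pipeline would likely rely on syntactic relations and association measures rather than raw adjacency.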