Cross-Lingual Content Scoring

Andrea Horbach; Sebastian Stennmanns; Torsten Zesch

doi:10.18653/v1/W18-0550

Abstract

We investigate the feasibility of cross-lingual content scoring, a scenario where training and test data in an automatic scoring task are from two different languages. Cross-lingual scoring can contribute to educational equality by allowing answers in multiple languages. Training a model in one language and applying it to another language might also help to overcome data sparsity issues by re-using trained models from other languages. As there is no suitable dataset available for this new task, we create a comparable bi-lingual corpus by extending the English ASAP dataset with German answers. Our experiments with cross-lingual scoring based on machine-translating either training or test data show a considerable drop in scoring quality.

Anthology ID: W18-0550
Volume: Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications
Month: June
Year: 2018
Address: New Orleans, Louisiana
Editors: Joel Tetreault, Jill Burstein, Ekaterina Kochmar, Claudia Leacock, Helen Yannakoudakis
Venue: BEA
SIG: SIGEDU
Publisher: Association for Computational Linguistics
Pages: 410–419
URL: https://aclanthology.org/W18-0550
DOI: 10.18653/v1/W18-0550
Cite (ACL): Andrea Horbach, Sebastian Stennmanns, and Torsten Zesch. 2018. Cross-Lingual Content Scoring. In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 410–419, New Orleans, Louisiana. Association for Computational Linguistics.
Cite (Informal): Cross-Lingual Content Scoring (Horbach et al., BEA 2018)
PDF: https://preview.aclanthology.org/nschneid-patch-5/W18-0550.pdf
