That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory

Xuemei Tang, Qi Su


Abstract
The evolution of language follows the rule of gradual change. Grammar, vocabulary, and lexical semantic shifts take place over time, resulting in a diachronic linguistic gap. As such, a considerable number of texts are written in the languages of different eras, which creates obstacles for natural language processing tasks such as word segmentation and machine translation. Although the Chinese language has a long history, previous Chinese natural language processing research has primarily focused on tasks within a specific era. Therefore, we propose CROSSWISE, a cross-era learning framework for Chinese word segmentation (CWS) that uses a Switch-memory (SM) module to incorporate era-specific linguistic knowledge. Experiments on four corpora from different eras show that performance on each corpus improves significantly. Further analyses also demonstrate that the SM module can effectively integrate era-specific knowledge into the neural network.
Anthology ID:
2022.acl-long.540
Volume:
Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
May
Year:
2022
Address:
Dublin, Ireland
Editors:
Smaranda Muresan, Preslav Nakov, Aline Villavicencio
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
7830–7840
URL:
https://aclanthology.org/2022.acl-long.540
DOI:
10.18653/v1/2022.acl-long.540
Cite (ACL):
Xuemei Tang and Qi Su. 2022. That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 7830–7840, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
That Slepen Al the Nyght with Open Ye! Cross-era Sequence Segmentation with Switch-memory (Tang & Su, ACL 2022)
PDF:
https://preview.aclanthology.org/improve-issue-templates/2022.acl-long.540.pdf