Abstract
We investigate the ability of Transformer-based language models to find syntactic differences between the English of the early 1800s and that of the late 1900s. First, we show that a fine-tuned BERT model can distinguish between text from these two periods using syntactic information only; to show this, we employ a strategy to hide semantic information from the text. Second, we make further use of fine-tuned BERT models to identify specific instances of syntactic change and specific words for which a new part of speech was introduced. To do this, we employ an automatic part-of-speech (POS) tagger and use it to train corpora-specific taggers based only on BERT representations pretrained on different corpora. Notably, our methods of identifying specific candidates for syntactic change avoid using any automatic POS tagger on old text, where its performance may be unreliable; instead, our methods only use untagged old text together with tagged modern text. We examine samples and distributional properties of the model output to validate automatically identified cases of syntactic change. Finally, we use our techniques to confirm the historical rise of the progressive construction, a known example of syntactic change.
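The abstract only summarizes the approach, but one of its ingredients, training a POS tagger on top of frozen, pretrained BERT token representations using tagged modern text only, can be illustrated with a short sketch. The snippet below is not the authors' code: the model name (`bert-base-uncased`), the toy tag inventory, and the training sentences are placeholder assumptions, and the paper's actual tagging architecture and data may differ.

```python
# Minimal sketch: a linear POS probe trained on frozen BERT token
# representations, using a tiny stand-in for a tagged modern corpus.
import torch
from torch import nn
from transformers import AutoTokenizer, AutoModel

TAGS = ["NOUN", "VERB", "DET", "ADP", "PUNCT"]          # toy tag inventory (assumption)
tag2id = {t: i for i, t in enumerate(TAGS)}

# Tiny stand-in for a POS-tagged modern corpus (illustrative only).
train_data = [
    (["The", "dog", "chased", "the", "cat", "."],
     ["DET", "NOUN", "VERB", "DET", "NOUN", "PUNCT"]),
    (["A", "bird", "sat", "on", "the", "roof", "."],
     ["DET", "NOUN", "VERB", "ADP", "DET", "NOUN", "PUNCT"]),
]

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")
encoder.eval()                                          # encoder stays frozen; only the probe is trained
probe = nn.Linear(encoder.config.hidden_size, len(TAGS))
optimizer = torch.optim.Adam(probe.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def word_embeddings(words):
    """Encode pre-tokenized words and return one vector per word
    (the representation of each word's first subword piece)."""
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = encoder(**enc).last_hidden_state[0]    # (seq_len, hidden_size)
    first_piece, seen = [], set()
    for idx, wid in enumerate(enc.word_ids(0)):
        if wid is not None and wid not in seen:
            seen.add(wid)
            first_piece.append(idx)
    return hidden[first_piece]                          # (num_words, hidden_size)

for epoch in range(20):
    for words, tags in train_data:
        feats = word_embeddings(words)
        gold = torch.tensor([tag2id[t] for t in tags])
        loss = loss_fn(probe(feats), gold)
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
```

A probe trained this way labels words in held-out text; comparing the predictions of probes built on encoders pretrained on different-period corpora is one way to surface words whose predicted part of speech differs across periods, which is the kind of signal the paper uses to flag candidates for syntactic change.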
- Anthology ID: 2023.findings-emnlp.230
- Volume: Findings of the Association for Computational Linguistics: EMNLP 2023
- Month: December
- Year: 2023
- Address: Singapore
- Editors: Houda Bouamor, Juan Pino, Kalika Bali
- Venue: Findings
- Publisher: Association for Computational Linguistics
- Pages: 3564–3574
- URL: https://aclanthology.org/2023.findings-emnlp.230
- DOI: 10.18653/v1/2023.findings-emnlp.230
- Cite (ACL): Liwen Hou and David Smith. 2023. Detecting Syntactic Change with Pre-trained Transformer Models. In Findings of the Association for Computational Linguistics: EMNLP 2023, pages 3564–3574, Singapore. Association for Computational Linguistics.
- Cite (Informal): Detecting Syntactic Change with Pre-trained Transformer Models (Hou & Smith, Findings 2023)
- PDF: https://preview.aclanthology.org/nschneid-patch-2/2023.findings-emnlp.230.pdf