Abstract
This paper presents the results of two experiments carried out within the framework of computational construction grammar. Starting from the constructionist view that language consists solely of constructions, including lexical ones, we tested the validity of a clustering algorithm primarily designed for multiword expression (MWE) extraction, the cpr-score (Colson, 2017), on Chinese word segmentation (CWS). Our results indicate a striking recall rate of 75 percent without any special adaptation to Chinese or to the lexicon, which confirms that there is some similarity between MWE extraction and CWS. Our second experiment also suggests that the same methodology might be used for extracting more schematic or abstract constructions, thereby providing evidence for the statistical foundation of construction grammar.
- Anthology ID:
- W18-4907
- Volume:
- Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018)
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Agata Savary, Carlos Ramisch, Jena D. Hwang, Nathan Schneider, Melanie Andresen, Sameer Pradhan, Miriam R. L. Petruck
- Venues:
- LAW | MWE
- SIGs:
- SIGLEX | SIGANN
- Publisher:
- Association for Computational Linguistics
- Pages:
- 41–50
- URL:
- https://aclanthology.org/W18-4907
- Cite (ACL):
- Jean-Pierre Colson. 2018. From Chinese Word Segmentation to Extraction of Constructions: Two Sides of the Same Algorithmic Coin. In Proceedings of the Joint Workshop on Linguistic Annotation, Multiword Expressions and Constructions (LAW-MWE-CxG-2018), pages 41–50, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- From Chinese Word Segmentation to Extraction of Constructions: Two Sides of the Same Algorithmic Coin (Colson, LAW-MWE 2018)
- PDF:
- https://preview.aclanthology.org/landing_page/W18-4907.pdf