Chinese Synesthesia Detection: New Dataset and Models

Xiaotong Jiang, Qingqing Zhao, Yunfei Long, Zhongqing Wang


Abstract
In this paper, we introduce a new task called synesthesia detection, which aims to extract the sensory word of a sentence and to predict the original and synesthetic sensory modalities of that word. Synesthesia refers to the description of perceptions in one sensory modality through concepts from other modalities. It is not only a linguistic phenomenon but also a cognitive phenomenon that structures human thought and action, which makes it a bridge between figurative language and abstract cognition and thus helpful for understanding deep semantics. To support this task, we construct a large-scale human-annotated Chinese synesthesia dataset, which contains 7,217 annotated sentences accompanied by 187 sensory words. Based on this dataset, we propose a family of strong and representative baseline models. Building on these baselines, we further propose a radical-based neural network model to identify the boundary of the sensory word and to jointly detect the original and synesthetic sensory modalities of the word. Extensive experiments, together with the dataset statistics and the progressive performance of the models, verify the importance of the proposed task and dataset. In addition, our proposed model achieves state-of-the-art results on the synesthesia dataset.
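As a rough illustration of the task's output, the sketch below models one annotated sentence as a sensory-word span plus two modality labels. It is not the authors' data format: the class name `SynesthesiaAnnotation`, the five-sense label set `MODALITIES`, the BIO tagging scheme, and the English example sentence are all assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumption: the five traditional senses. The paper's label inventory may differ.
MODALITIES = ("vision", "hearing", "taste", "smell", "touch")


@dataclass(frozen=True)
class SynesthesiaAnnotation:
    """Hypothetical record for one sentence in a synesthesia detection task."""

    tokens: Tuple[str, ...]  # sentence split into tokens (or characters)
    start: int               # index of the first token of the sensory word
    end: int                 # index one past the last token (exclusive)
    original: str            # modality the sensory word literally belongs to
    synesthetic: str         # modality the word describes in this sentence

    def __post_init__(self) -> None:
        # Reject spans that fall outside the sentence or are empty.
        if not (0 <= self.start < self.end <= len(self.tokens)):
            raise ValueError("sensory-word span is out of range")
        # Reject labels outside the assumed modality inventory.
        for label in (self.original, self.synesthetic):
            if label not in MODALITIES:
                raise ValueError(f"unknown modality: {label}")

    @property
    def sensory_word(self) -> str:
        """Return the surface form of the annotated sensory word."""
        return "".join(self.tokens[self.start:self.end])

    def bio_tags(self) -> List[str]:
        """Encode the sensory-word boundary as B/I/O tags, one per token."""
        tags = ["O"] * len(self.tokens)
        tags[self.start] = "B"
        for i in range(self.start + 1, self.end):
            tags[i] = "I"
        return tags


# Illustrative English example (not taken from the dataset):
# "sweet" is a taste word used here to describe a sound.
example = SynesthesiaAnnotation(
    tokens=("she", "has", "a", "sweet", "voice"),
    start=3,
    end=4,
    original="taste",
    synesthetic="hearing",
)
print(example.sensory_word, example.original, example.synesthetic)
print(example.bio_tags())
```

Under these assumptions, boundary identification becomes a sequence-labeling problem over the tags, and modality detection becomes two classification decisions on the extracted word.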
Anthology ID:
2022.findings-acl.306
Volume:
Findings of the Association for Computational Linguistics: ACL 2022
Month:
May
Year:
2022
Address:
Dublin, Ireland
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
3877–3887
URL:
https://aclanthology.org/2022.findings-acl.306
DOI:
10.18653/v1/2022.findings-acl.306
Cite (ACL):
Xiaotong Jiang, Qingqing Zhao, Yunfei Long, and Zhongqing Wang. 2022. Chinese Synesthesia Detection: New Dataset and Models. In Findings of the Association for Computational Linguistics: ACL 2022, pages 3877–3887, Dublin, Ireland. Association for Computational Linguistics.
Cite (Informal):
Chinese Synesthesia Detection: New Dataset and Models (Jiang et al., Findings 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.findings-acl.306.pdf