The Chinese Causative-Passive Homonymy Disambiguation: an adversarial Dataset for NLI and a Probing Task

Shanshan Xu; Katja Markert

Abstract

The disambiguation of causative-passive homonymy (CPH) is potentially tricky for machines, as the causative and the passive are not distinguished by the sentences’ syntactic structure. By transforming CPH disambiguation into a challenging natural language inference (NLI) task, we present the first Chinese Adversarial NLI challenge set (CANLI). We show that the pretrained transformer model RoBERTa, fine-tuned on an existing large-scale Chinese NLI benchmark dataset, performs poorly on CANLI. We also employ Word Sense Disambiguation as a probing task to investigate to what extent the CPH feature is captured in the model’s internal representation. We find that the model’s performance on CANLI does not correspond to its internal representation of CPH, the linguistic ability central to the CANLI dataset. CANLI is available on Hugging Face Datasets (Lhoest et al., 2021) at https://huggingface.co/datasets/sxu/CANLI

Anthology ID: 2022.lrec-1.460
Volume: Proceedings of the Thirteenth Language Resources and Evaluation Conference
Month: June
Year: 2022
Address: Marseille, France
Editors: Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
Venue: LREC
Publisher: European Language Resources Association
Pages: 4316–4323
URL: https://preview.aclanthology.org/fix-sig-urls/2022.lrec-1.460/
Cite (ACL): Shanshan Xu and Katja Markert. 2022. The Chinese Causative-Passive Homonymy Disambiguation: an adversarial Dataset for NLI and a Probing Task. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 4316–4323, Marseille, France. European Language Resources Association.
Cite (Informal): The Chinese Causative-Passive Homonymy Disambiguation: an adversarial Dataset for NLI and a Probing Task (Xu & Markert, LREC 2022)
PDF: https://preview.aclanthology.org/fix-sig-urls/2022.lrec-1.460.pdf
