Annotating Japanese Numeral Expressions for a Logical and Pragmatic Inference Dataset

Kana Koyano, Hitomi Yanaka, Koji Mineshima, Daisuke Bekki


Abstract
Numeral expressions in Japanese are characterized by the flexibility of quantifier positions and the variety of numeral suffixes. However, little work has been done to build annotated corpora focusing on these features or datasets for testing the understanding of Japanese numeral expressions. In this study, we build a corpus that annotates each numeral expression in an existing phrase structure-based Japanese treebank with its usage and numeral suffix types. We also construct an inference test set for numeral expressions based on this annotated corpus. In this test set, we pay particular attention to inferences where the correct label differs between logical entailment and implicature, and to contexts such as negations and conditionals where the entailment labels can be reversed. The baseline experiment with Japanese BERT models shows that our inference test set poses challenges for inference involving various types of numeral expressions.
Anthology ID:
2022.isa-1.17
Volume:
Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022
Month:
June
Year:
2022
Address:
Marseille, France
Editor:
Harry Bunt
Venue:
ISA
Publisher:
European Language Resources Association
Pages:
127–132
URL:
https://aclanthology.org/2022.isa-1.17
Cite (ACL):
Kana Koyano, Hitomi Yanaka, Koji Mineshima, and Daisuke Bekki. 2022. Annotating Japanese Numeral Expressions for a Logical and Pragmatic Inference Dataset. In Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022, pages 127–132, Marseille, France. European Language Resources Association.
Cite (Informal):
Annotating Japanese Numeral Expressions for a Logical and Pragmatic Inference Dataset (Koyano et al., ISA 2022)
PDF:
https://preview.aclanthology.org/naacl24-info/2022.isa-1.17.pdf