HerBERT Based Language Model Detects Quantifiers and Their Semantic Properties in Polish
Marcin Woliński, Bartłomiej Nitoń, Witold Kieraś, Jakub Szymanik
Abstract
The paper presents a tool for automatically marking up quantifying expressions, their semantic features, and their scopes. We explore the idea of using a BERT-based neural model for the task (in this case HerBERT, a model trained specifically for Polish, is used). The tool is trained on a recently, manually annotated Corpus of Polish Quantificational Expressions (Szymanik and Kieraś, 2022). We discuss how it performs against human annotation and present the results of automatically annotating a 300-million-word sub-corpus of the National Corpus of Polish. Our results show that language models can effectively recognise the semantic category of quantification as well as identify key semantic properties of quantifiers, such as monotonicity. Furthermore, the algorithm we have developed can be used for building semantically annotated quantifier corpora for other languages.
- Anthology ID:
- 2022.lrec-1.773
- Volume:
- Proceedings of the Thirteenth Language Resources and Evaluation Conference
- Month:
- June
- Year:
- 2022
- Address:
- Marseille, France
- Editors:
- Nicoletta Calzolari, Frédéric Béchet, Philippe Blache, Khalid Choukri, Christopher Cieri, Thierry Declerck, Sara Goggi, Hitoshi Isahara, Bente Maegaard, Joseph Mariani, Hélène Mazo, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association
- Pages:
- 7140–7146
- URL:
- https://aclanthology.org/2022.lrec-1.773
- Cite (ACL):
- Marcin Woliński, Bartłomiej Nitoń, Witold Kieraś, and Jakub Szymanik. 2022. HerBERT Based Language Model Detects Quantifiers and Their Semantic Properties in Polish. In Proceedings of the Thirteenth Language Resources and Evaluation Conference, pages 7140–7146, Marseille, France. European Language Resources Association.
- Cite (Informal):
- HerBERT Based Language Model Detects Quantifiers and Their Semantic Properties in Polish (Woliński et al., LREC 2022)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-1/2022.lrec-1.773.pdf
- Data
- KLEJ
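The abstract frames the task as marking up quantifier spans together with semantic features such as monotonicity, which is naturally cast as token classification for a BERT-style model. The sketch below shows one common way to encode such span annotations as BIO labels; the sentence, span boundaries, and label names (e.g. `UNIV:MON_UP`) are invented for illustration and are not the paper's actual tag set.

```python
# Hypothetical sketch: encoding quantifier span annotations as BIO tags,
# the per-token label format a token-classification model would consume.
# Labels and features here are assumptions, not the corpus's real scheme.

def spans_to_bio(tokens, spans):
    """Convert (start, end, category) token-index spans to BIO labels.

    Each span covers tokens [start, end) and carries a semantic
    category, e.g. the quantifier class plus its monotonicity.
    """
    labels = ["O"] * len(tokens)
    for start, end, category in spans:
        labels[start] = f"B-{category}"          # span-initial token
        for i in range(start + 1, end):
            labels[i] = f"I-{category}"          # span-internal tokens
    return labels

# "Każdy student przeczytał książkę" -- 'Every student read a book'.
tokens = ["Każdy", "student", "przeczytał", "książkę"]
# One quantifier span: "Każdy", universal, upward monotone (assumed tags).
spans = [(0, 1, "UNIV:MON_UP")]
print(spans_to_bio(tokens, spans))
# → ['B-UNIV:MON_UP', 'O', 'O', 'O']
```

A fine-tuned model then predicts one such label per (sub)token, from which quantifier spans and their semantic properties can be read off directly.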