Optimizing text representations to capture (dis)similarity between political parties

Tanise Ceron, Nico Blokker, Sebastian Padó


Abstract
Even though fine-tuned neural language models have been pivotal in enabling “deep” automatic text analysis, optimizing text representations for specific applications remains a crucial bottleneck. In this study, we look at this problem in the context of a task from computational social science, namely modeling pairwise similarities between political parties. Our research question is what level of structural information is necessary to create robust text representations, contrasting a strongly informed approach (which uses both claim span and claim category annotations) with approaches that replace one or both types of annotation with document structure-based heuristics. Evaluating our models on the manifestos of German parties for the 2021 federal election, we find that heuristics that maximize within-party over between-party similarity, along with a normalization step, lead to reliable party similarity prediction, without the need for manual annotation.
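The pipeline the abstract alludes to can be illustrated as follows. This is a minimal sketch, not the authors' implementation: it assumes per-party sentence embeddings are already available (here filled with random placeholders), mean-pools them into one vector per party, L2-normalizes, and computes a pairwise cosine-similarity matrix. The party names and embedding dimensions are hypothetical.

```python
import numpy as np

def party_similarity(party_embeddings: dict) -> tuple:
    """Return party names and their pairwise cosine-similarity matrix.

    party_embeddings maps each party name to an array of shape
    (num_sentences, embedding_dim).
    """
    names = sorted(party_embeddings)
    # Mean-pool each party's sentence embeddings into a single vector.
    reps = np.stack([party_embeddings[p].mean(axis=0) for p in names])
    # L2-normalize so the dot product equals cosine similarity.
    reps /= np.linalg.norm(reps, axis=1, keepdims=True)
    return names, reps @ reps.T

# Placeholder data: three hypothetical parties, 100 sentences each,
# 384-dimensional embeddings (e.g. from a sentence encoder).
rng = np.random.default_rng(0)
parties = {p: rng.normal(size=(100, 384)) for p in ["A", "B", "C"]}
names, sim = party_similarity(parties)
```

The resulting symmetric matrix `sim` can then be compared against an expert-derived party similarity ground truth, which is the evaluation setting the paper describes.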
Anthology ID:
2022.conll-1.22
Volume:
Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL)
Month:
December
Year:
2022
Address:
Abu Dhabi, United Arab Emirates (Hybrid)
Venue:
CoNLL
Publisher:
Association for Computational Linguistics
Pages:
325–338
URL:
https://aclanthology.org/2022.conll-1.22
Cite (ACL):
Tanise Ceron, Nico Blokker, and Sebastian Padó. 2022. Optimizing text representations to capture (dis)similarity between political parties. In Proceedings of the 26th Conference on Computational Natural Language Learning (CoNLL), pages 325–338, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Optimizing text representations to capture (dis)similarity between political parties (Ceron et al., CoNLL 2022)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2022.conll-1.22.pdf