Multilingual Power and Ideology identification in the Parliament: a reference dataset and simple baselines

Çağrı Çöltekin, Matyáš Kopp, Meden Katja, Vaidas Morkevicius, Nikola Ljubešić, Tomaž Erjavec


Abstract
We introduce a dataset for political orientation and power position identification. The dataset is derived from ParlaMint, a set of comparable corpora of transcribed parliamentary speeches from 29 national and regional parliaments. We describe the dataset, explain the reasoning behind some of the choices made during its creation, and present dataset statistics. Using a simple classifier, we also report baseline results on two tasks: predicting political orientation on the left-to-right axis, and identifying power position, i.e., distinguishing speeches delivered by members of governing coalition parties from those delivered by members of opposition parties.
Anthology ID:
2024.parlaclarin-1.14
Volume:
Proceedings of the IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora (ParlaCLARIN) @ LREC-COLING 2024
Month:
May
Year:
2024
Address:
Torino, Italia
Editors:
Darja Fiser, Maria Eskevich, David Bordon
Venues:
ParlaCLARIN | WS
Publisher:
ELRA and ICCL
Pages:
94–100
URL:
https://aclanthology.org/2024.parlaclarin-1.14
Cite (ACL):
Çağrı Çöltekin, Matyáš Kopp, Meden Katja, Vaidas Morkevicius, Nikola Ljubešić, and Tomaž Erjavec. 2024. Multilingual Power and Ideology identification in the Parliament: a reference dataset and simple baselines. In Proceedings of the IV Workshop on Creating, Analysing, and Increasing Accessibility of Parliamentary Corpora (ParlaCLARIN) @ LREC-COLING 2024, pages 94–100, Torino, Italia. ELRA and ICCL.
Cite (Informal):
Multilingual Power and Ideology identification in the Parliament: a reference dataset and simple baselines (Çöltekin et al., ParlaCLARIN-WS 2024)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/2024.parlaclarin-1.14.pdf