Abstract
News articles often quote statements made by information sources, that is, by the individuals who take part in the news. Detecting those quotes and the gender of their sources is a key task in media analysis from a gender perspective. It is a challenging task: the structure of quotes is variable, gender markers are absent in many languages, and quote authors are often omitted because of the frequent use of coreference. This paper proposes a strategy to measure the presence of women and men as information sources in news. We approach the detection of sentences that include quotes and the gender of the speaker as a joint task, by means of a supervised multiclass sentence classifier. We have created the first datasets for Spanish and Basque by manually annotating quotes and the gender of the associated sources in news items. The results show that BERT-based approaches are significantly better than classical bag-of-words-based ones, achieving accuracies close to 90%. We also analyse a bilingual learning strategy and the synthetic generation of additional training examples; these provide improvements of up to 3.4% and 5.6%, respectively.
- Anthology ID: 2022.latechclfl-1.15
- Volume: Proceedings of the 6th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
- Month: October
- Year: 2022
- Address: Gyeongju, Republic of Korea
- Editors: Stefania Degaetano, Anna Kazantseva, Nils Reiter, Stan Szpakowicz
- Venue: LaTeCHCLfL
- SIG: SIGHUM
- Publisher: International Conference on Computational Linguistics
- Pages: 126–134
- URL: https://aclanthology.org/2022.latechclfl-1.15
- Cite (ACL): Muitze Zulaika, Xabier Saralegi, and Iñaki San Vicente. 2022. Measuring Presence of Women and Men as Information Sources in News. In Proceedings of the 6th Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 126–134, Gyeongju, Republic of Korea. International Conference on Computational Linguistics.
- Cite (Informal): Measuring Presence of Women and Men as Information Sources in News (Zulaika et al., LaTeCHCLfL 2022)
- PDF: https://preview.aclanthology.org/nschneid-patch-1/2022.latechclfl-1.15.pdf