Abstract
This paper presents a method for detecting fine-grained categories of propaganda in text. Given a sentence, our method aims to identify a span of words and predict the type of propaganda used. To detect propaganda, we explore a method for extracting features of propaganda from contextualized embeddings without fine-tuning the large number of parameters of the base model. We show that by generating synthetic embeddings we can train a linear function with ReLU activation to extract useful labeled embeddings from an embedding space generated by a general-purpose language model. We also introduce an inference technique to detect continuous spans in sequences of propaganda tokens in sentences. The result of an ensemble model was submitted to the first shared task on fine-grained propaganda detection at NLP4IF as Team Stalin. In this paper, we provide additional analysis of our method of detecting spans of propaganda with synthetically generated representations.

- Anthology ID:
- D19-5023
- Volume:
- Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda
- Month:
- November
- Year:
- 2019
- Address:
- Hong Kong, China
- Editors:
- Anna Feldman, Giovanni Da San Martino, Alberto Barrón-Cedeño, Chris Brew, Chris Leberknight, Preslav Nakov
- Venue:
- NLP4IF
- Publisher:
- Association for Computational Linguistics
- Pages:
- 155–161
- URL:
- https://aclanthology.org/D19-5023
- DOI:
- 10.18653/v1/D19-5023
- Cite (ACL):
- Adam Ek and Mehdi Ghanimifard. 2019. Synthetic Propaganda Embeddings To Train A Linear Projection. In Proceedings of the Second Workshop on Natural Language Processing for Internet Freedom: Censorship, Disinformation, and Propaganda, pages 155–161, Hong Kong, China. Association for Computational Linguistics.
- Cite (Informal):
- Synthetic Propaganda Embeddings To Train A Linear Projection (Ek & Ghanimifard, NLP4IF 2019)
- PDF:
- https://preview.aclanthology.org/naacl24-info/D19-5023.pdf