Abstract
We present a simple approach to the generation and labeling of extraction patterns for coding political event data, an important task in computational social science. We use weak supervision to identify pattern candidates and learn distributed representations for them. Given seed extraction patterns from existing pattern dictionaries, we use label propagation to label pattern candidates. We present two case studies: (i) we derive patterns of acceptable quality for a number of international relations and conflict categories using the pattern candidates of O’Connor et al. (2013); (ii) we derive patterns for coding protest events that outperform an established set of hand-crafted Tabari / Petrarch patterns.
- Anthology ID:
- W18-4512
- Volume:
- Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico
- Venue:
- LaTeCH
- Publisher:
- Association for Computational Linguistics
- Pages:
- 103–112
- URL:
- https://aclanthology.org/W18-4512
- Cite (ACL):
- Peter Makarov. 2018. Automated Acquisition of Patterns for Coding Political Event Data: Two Case Studies. In Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature, pages 103–112, Santa Fe, New Mexico. Association for Computational Linguistics.
- Cite (Informal):
- Automated Acquisition of Patterns for Coding Political Event Data: Two Case Studies (Makarov, LaTeCH 2018)
- PDF:
- https://preview.aclanthology.org/remove-xml-comments/W18-4512.pdf