Political Event Coding as Text-to-Text Sequence Generation

Yaoyao Dai, Benjamin Radford, Andrew Halterman


Abstract
We report on the current status of an effort to produce political event data from unstructured text via a Transformer language model. Compelled by the current lack of publicly available and up-to-date event coding software, we seek to train a model that can produce structured political event records at the sentence level. Our approach differs from previous efforts in that we conceptualize this task as one of text-to-text sequence generation. We motivate this choice by outlining desirable properties of text generation models for the needs of event coding. To overcome the lack of sufficient training data, we also describe a method for generating synthetic text and event record pairs that we use to fit our model.
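As a rough illustration of the text-to-text framing described above (the specific model, prompt format, and serialized event schema here are illustrative assumptions, not the authors' actual configuration), a fine-tuned sequence-to-sequence Transformer could map a sentence directly to a structured event record emitted as text:

# Minimal sketch of sentence-level event coding as text-to-text generation,
# assuming a seq2seq Transformer (e.g., a T5 variant) fine-tuned on
# sentence / serialized-event-record pairs. Model name, prompt prefix, and
# output format are hypothetical placeholders.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

model_name = "t5-small"  # placeholder; a model fine-tuned for event coding would be used
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSeq2SeqLM.from_pretrained(model_name)

sentence = "Protesters clashed with police in the capital on Tuesday."
# The target is itself text: a serialized event record the model learns to emit,
# e.g. "source: protesters | target: police | event: fight | date: Tuesday"
inputs = tokenizer("code events: " + sentence, return_tensors="pt")
output_ids = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))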
Anthology ID: 2022.case-1.16
Volume: Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE)
Month: December
Year: 2022
Address: Abu Dhabi, United Arab Emirates (Hybrid)
Venue: CASE
Publisher: Association for Computational Linguistics
Pages: 117–123
URL: https://aclanthology.org/2022.case-1.16
Cite (ACL):
Yaoyao Dai, Benjamin Radford, and Andrew Halterman. 2022. Political Event Coding as Text-to-Text Sequence Generation. In Proceedings of the 5th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE), pages 117–123, Abu Dhabi, United Arab Emirates (Hybrid). Association for Computational Linguistics.
Cite (Informal):
Political Event Coding as Text-to-Text Sequence Generation (Dai et al., CASE 2022)
PDF: https://preview.aclanthology.org/ingestion-script-update/2022.case-1.16.pdf