text2story: A Python Toolkit to Extract and Visualize Story Components of Narrative Text
Evelin Amorim, Ricardo Campos, Alipio Jorge, Pedro Mota, Rúben Almeida
Abstract
Story components, namely events, time, participants, and their relations, are present in narrative texts from different domains such as journalism, medicine, finance, and law. The automatic extraction of narrative elements encompasses several NLP tasks such as Named Entity Recognition, Semantic Role Labeling, Event Extraction, Coreference Resolution, and Temporal Inference. text2story, an easy-to-use modular Python library, supports the narrative extraction and visualization pipeline. The package contains an array of narrative extraction tools that can be used separately or in sequence. With this toolkit, end users can process free text in English or Portuguese and obtain formal representations, like standard annotation files or a formal logical representation. The toolkit also enables narrative visualization as Message Sequence Charts (MSC), Knowledge Graphs, and Bubble Diagrams, making it useful to visualize and transform human-annotated narratives. The package combines the use of off-the-shelf and custom tools and is easily patched (replacing existing components) and extended (e.g. with new visualizations). It includes an experimental module for narrative element effectiveness assessment and is therefore also a valuable asset for researchers developing solutions for narrative extraction. To evaluate the baseline components, we present some results of the main annotators embedded in our package for datasets in English and Portuguese. We also compare the results with the extraction of narrative elements by GPT-3, a robust LLM.
- Anthology ID:
- 2024.lrec-main.1369
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 15761–15772
- URL:
- https://aclanthology.org/2024.lrec-main.1369
- Cite (ACL):
- Evelin Amorim, Ricardo Campos, Alipio Jorge, Pedro Mota, and Rúben Almeida. 2024. text2story: A Python Toolkit to Extract and Visualize Story Components of Narrative Text. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 15761–15772, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- text2story: A Python Toolkit to Extract and Visualize Story Components of Narrative Text (Amorim et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2024.lrec-main.1369.pdf