Abstract
We propose a framework for quantitative-qualitative research in corpus-assisted discourse studies (CADS), which operationalises the central process of manually forming groups of related words and phrases in terms of “discoursemes” and their constellations. We introduce an open-source implementation of this framework in the form of a REST API based on Corpus Workbench. Going through the workflow of a collocation analysis for fleeing and related terms in the German Federal Parliament, the paper gives details about the underlying algorithms, with available parameters and further possible choices. We also address multi-word units (which are often disregarded by CADS tools), a semantic map visualisation of collocations, and how to compute associations between discoursemes.
- Anthology ID:
- 2024.cpss-1.3
- Volume:
- Proceedings of the 4th Workshop on Computational Linguistics for the Political and Social Sciences: Long and short papers
- Month:
- Sep
- Year:
- 2024
- Address:
- Vienna, Austria
- Editors:
- Christopher Klamm, Gabriella Lapesa, Simone Paolo Ponzetto, Ines Rehbein, Indira Sen
- Venues:
- cpss | WS
- Publisher:
- Association for Computational Linguistics
- Pages:
- 33–44
- URL:
- https://aclanthology.org/2024.cpss-1.3
- Cite (ACL):
- Philipp Heinrich and Stephanie Evert. 2024. Operationalising the Hermeneutic Grouping Process in Corpus-assisted Discourse Studies. In Proceedings of the 4th Workshop on Computational Linguistics for the Political and Social Sciences: Long and short papers, pages 33–44, Vienna, Austria. Association for Computational Linguistics.
- Cite (Informal):
- Operationalising the Hermeneutic Grouping Process in Corpus-assisted Discourse Studies (Heinrich & Evert, cpss-WS 2024)
- PDF:
- https://preview.aclanthology.org/add_acl24_videos/2024.cpss-1.3.pdf