Coherence-based Dialogue Discourse Structure Extraction using Open-Source Large Language Models
Gaetano Cimino, Chuyuan Li, Giuseppe Carenini, Vincenzo Deufemia
Abstract
Despite the challenges posed by data sparsity in discourse parsing for dialogues, unsupervised methods have been underexplored. Leveraging recent advances in Large Language Models (LLMs), in this paper we investigate an unsupervised coherence-based method to build discourse structures for multi-party dialogues using open-source LLMs fine-tuned on conversational data. Specifically, we propose two algorithms that extract dialogue structures by identifying their most coherent sub-dialogues: DS-DP employs a dynamic programming strategy, while DS-FLOW applies a greedy approach. Evaluation on the STAC corpus demonstrates a micro-F1 score of 58.1%, surpassing prior unsupervised methods. Furthermore, on a cleaned subset of the Molweni corpus, the proposed method achieves a micro-F1 score of 74.7%, highlighting its effectiveness across different corpora.
- Anthology ID:
- 2024.sigdial-1.26
- Volume:
- Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue
- Month:
- September
- Year:
- 2024
- Address:
- Kyoto, Japan
- Editors:
- Tatsuya Kawahara, Vera Demberg, Stefan Ultes, Koji Inoue, Shikib Mehri, David Howcroft, Kazunori Komatani
- Venue:
- SIGDIAL
- SIG:
- SIGDIAL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 297–316
- URL:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.sigdial-1.26/
- DOI:
- 10.18653/v1/2024.sigdial-1.26
- Cite (ACL):
- Gaetano Cimino, Chuyuan Li, Giuseppe Carenini, and Vincenzo Deufemia. 2024. Coherence-based Dialogue Discourse Structure Extraction using Open-Source Large Language Models. In Proceedings of the 25th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 297–316, Kyoto, Japan. Association for Computational Linguistics.
- Cite (Informal):
- Coherence-based Dialogue Discourse Structure Extraction using Open-Source Large Language Models (Cimino et al., SIGDIAL 2024)
- PDF:
- https://preview.aclanthology.org/jlcl-multiple-ingestion/2024.sigdial-1.26.pdf