Abstract
While discourse parsing has made considerable progress in recent years, discourse segmentation of conversational speech remains difficult. In this paper, we exploit a French data set that has been manually segmented into discourse units to compare two approaches to discourse segmentation: fine-tuning existing systems on manual segmentation vs. using hand-crafted labelling rules to develop a weakly supervised segmenter. Our results show that both approaches yield similar performance in terms of F-score, while the weakly supervised approach, based on data programming, requires less manual annotation work. In a second experiment, we vary the amount of training data used for fine-tuning and show that a small amount of hand-labelled data is enough to obtain good results (although significantly lower than in the first experiment, which uses all the annotated data available).
- Anthology ID:
- 2023.nodalida-1.44
- Volume:
- Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa)
- Month:
- May
- Year:
- 2023
- Address:
- Tórshavn, Faroe Islands
- Editors:
- Tanel Alumäe, Mark Fishel
- Venue:
- NoDaLiDa
- Publisher:
- University of Tartu Library
- Pages:
- 436–446
- URL:
- https://aclanthology.org/2023.nodalida-1.44
- Cite (ACL):
- Laurent Prevot, Julie Hunter, and Philippe Muller. 2023. Comparing Methods for Segmenting Elementary Discourse Units in a French Conversational Corpus. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa), pages 436–446, Tórshavn, Faroe Islands. University of Tartu Library.
- Cite (Informal):
- Comparing Methods for Segmenting Elementary Discourse Units in a French Conversational Corpus (Prevot et al., NoDaLiDa 2023)
- PDF:
- https://preview.aclanthology.org/fix-dup-bibkey/2023.nodalida-1.44.pdf