Automatic Speech Interruption Detection: Analysis, Corpus, and System
Martin Lebourdais, Marie Tahon, Antoine Laurent, Sylvain Meignier
Abstract
Interruption detection is a new and challenging task in speech processing. This article presents a comprehensive study of automatic speech interruption detection, covering the definition of the task, the assembly of a specialized corpus, and the development of an initial baseline system. We provide three main contributions. First, we define the task, taking into account the nuanced nature of interruptions within spontaneous conversations. Second, we introduce a new corpus of conversational data, annotated for interruptions, to facilitate research in this domain. This corpus serves as a resource for evaluating and advancing interruption detection techniques. Finally, we present a first baseline system, which uses speech processing methods to automatically identify interruptions in speech, with promising results. In this article, we draw on theoretical notions of interruption to build a simplification of this notion based on overlapped speech detection. Our findings can serve both as a foundation for further research in the field and as a benchmark for assessing future advances in automatic speech interruption detection.
- Anthology ID:
- 2024.lrec-main.176
- Volume:
- Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
- Month:
- May
- Year:
- 2024
- Address:
- Torino, Italia
- Editors:
- Nicoletta Calzolari, Min-Yen Kan, Veronique Hoste, Alessandro Lenci, Sakriani Sakti, Nianwen Xue
- Venues:
- LREC | COLING
- Publisher:
- ELRA and ICCL
- Pages:
- 1959–1968
- URL:
- https://aclanthology.org/2024.lrec-main.176
- Cite (ACL):
- Martin Lebourdais, Marie Tahon, Antoine Laurent, and Sylvain Meignier. 2024. Automatic Speech Interruption Detection: Analysis, Corpus, and System. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 1959–1968, Torino, Italia. ELRA and ICCL.
- Cite (Informal):
- Automatic Speech Interruption Detection: Analysis, Corpus, and System (Lebourdais et al., LREC-COLING 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-2/2024.lrec-main.176.pdf