Investigating the Representation of Backchannels and Fillers in Fine-tuned Language Models

Yu Wang, Leyi Lao, Langchu Huang, Gabriel Skantze, Yang Xu, Hendrik Buschmeier


Abstract
Backchannels and fillers are important linguistic expressions in dialogue, but they are often treated as "noise" to be bypassed in modern transformer-based language models (LMs). Our work studies their representation in language models using three fine-tuning strategies. The models are trained on three dialogue corpora in English and Japanese, in which backchannels and fillers are preserved and annotated, to investigate how fine-tuning can help LMs learn their representations. We first apply clustering analysis to the learnt representations of backchannels and fillers and find increased silhouette scores for representations from fine-tuned models, which suggests that fine-tuning enables LMs to distinguish the nuanced semantic variation across different backchannel and filler uses. We also use natural language generation (NLG) metrics and qualitative analysis to confirm that the utterances generated by fine-tuned language models resemble human-produced utterances more closely. Our findings suggest the potential of transforming general LMs into conversational LMs that are more capable of adequately producing human-like language.
Anthology ID:
2026.acl-long.241
Volume:
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Month:
July
Year:
2026
Address:
San Diego, California, United States
Editors:
Maria Liakata, Viviane P. Moreira, Jiajun Zhang, David Jurgens
Venue:
ACL
Publisher:
Association for Computational Linguistics
Pages:
5319–5348
URL:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.241/
Cite (ACL):
Yu Wang, Leyi Lao, Langchu Huang, Gabriel Skantze, Yang Xu, and Hendrik Buschmeier. 2026. Investigating the Representation of Backchannels and Fillers in Fine-tuned Language Models. In Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5319–5348, San Diego, California, United States. Association for Computational Linguistics.
Cite (Informal):
Investigating the Representation of Backchannels and Fillers in Fine-tuned Language Models (Wang et al., ACL 2026)
PDF:
https://preview.aclanthology.org/ingest-acl/2026.acl-long.241.pdf
Checklist:
 2026.acl-long.241.checklist.pdf