Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models
Anthony Sicilia, Hyunwoo Kim, Khyathi Chandu, Malihe Alikhani, Jack Hessel
Abstract
Effective interlocutors account for the uncertain goals, beliefs, and emotions of others. But even the best human conversationalist cannot perfectly anticipate the trajectory of a dialogue. How well can language models represent inherent uncertainty in conversations? We propose FortUne Dial, an expansion of the long-standing “conversation forecasting” task: instead of just accuracy, evaluation is conducted with uncertainty-aware metrics, effectively enabling abstention on individual instances. We study two ways in which language models potentially represent outcome uncertainty (internally, using scores and directly, using tokens) and propose fine-tuning strategies to improve calibration of both representations. Experiments on eight difficult negotiation corpora demonstrate that our proposed fine-tuning strategies (a traditional supervision strategy and an off-policy reinforcement learning strategy) can calibrate smaller open-source models to compete with pre-trained models 10x their size.
- Anthology ID:
- 2024.findings-acl.697
- Volume:
- Findings of the Association for Computational Linguistics ACL 2024
- Month:
- August
- Year:
- 2024
- Address:
- Bangkok, Thailand and virtual meeting
- Editors:
- Lun-Wei Ku, Andre Martins, Vivek Srikumar
- Venue:
- Findings
- Publisher:
- Association for Computational Linguistics
- Pages:
- 11700–11726
- URL:
- https://aclanthology.org/2024.findings-acl.697
- Cite (ACL):
- Anthony Sicilia, Hyunwoo Kim, Khyathi Chandu, Malihe Alikhani, and Jack Hessel. 2024. Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models. In Findings of the Association for Computational Linguistics ACL 2024, pages 11700–11726, Bangkok, Thailand and virtual meeting. Association for Computational Linguistics.
- Cite (Informal):
- Deal, or no deal (or who knows)? Forecasting Uncertainty in Conversations using Large Language Models (Sicilia et al., Findings 2024)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/2024.findings-acl.697.pdf