Kevin Zhou
2025
Evaluating Uncertainty Quantification Methods in Argumentative Large Language Models
Kevin Zhou | Adam Dejl | Gabriel Freedman | Lihu Chen | Antonio Rago | Francesca Toni
Findings of the Association for Computational Linguistics: EMNLP 2025
Research in uncertainty quantification (UQ) for large language models (LLMs) is increasingly important for guaranteeing the reliability of this groundbreaking technology. We explore the integration of LLM UQ methods in argumentative LLMs (ArgLLMs), an explainable LLM framework for decision-making based on computational argumentation in which UQ plays a critical role. We conduct experiments to evaluate ArgLLMs’ performance on claim verification tasks when using different LLM UQ methods, inherently performing an assessment of the UQ methods’ effectiveness. Moreover, the experimental procedure itself is a novel way of evaluating the effectiveness of UQ methods, especially when intricate and potentially contentious statements are present. Our results demonstrate that, despite its simplicity, direct prompting is an effective UQ strategy in ArgLLMs, outperforming considerably more complex approaches.
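As a rough illustration of the direct-prompting baseline highlighted in the abstract (not the paper's actual implementation), confidence elicitation for a claim might look like the sketch below; the `generate` wrapper is a hypothetical stand-in for a real LLM API call.

```python
# Illustrative sketch only: direct-prompting confidence elicitation for a claim.
# `generate` is a hypothetical wrapper around an LLM API, not part of the paper.
import re

def generate(prompt: str) -> str:
    """Placeholder for an LLM call; replace with a real API client."""
    raise NotImplementedError

def direct_prompt_confidence(claim: str) -> float:
    """Ask the model to rate how likely the claim is true, on a 0-1 scale."""
    prompt = (
        "How confident are you that the following claim is true?\n"
        f"Claim: {claim}\n"
        "Answer with a single number between 0 and 1."
    )
    reply = generate(prompt)
    match = re.search(r"\d*\.?\d+", reply)
    score = float(match.group()) if match else 0.5  # fall back to maximal uncertainty
    return min(max(score, 0.0), 1.0)  # clamp any out-of-range reply
```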
2019
Hierarchical Attention Prototypical Networks for Few-Shot Text Classification
Shengli Sun | Qingfeng Sun | Kevin Zhou | Tengchao Lv
Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP)
Most current effective methods for text classification rely on large-scale labeled data and a great number of parameters, but when supervised training data are scarce and difficult to collect, these models are not applicable. In this work, we propose hierarchical attention prototypical networks (HAPN) for few-shot text classification. We design feature-level, word-level, and instance-level multi-cross attention for our model to enhance the expressive ability of the semantic space, so it can highlight or weaken the importance of features, words, and instances separately. We verify the effectiveness of our model on two standard benchmark few-shot text classification datasets, FewRel and CSID, and achieve state-of-the-art performance. The visualization of the hierarchical attention layers illustrates that our model can capture more important features, words, and instances. In addition, our attention mechanism increases support set augmentability and accelerates convergence speed in the training stage.
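For readers unfamiliar with the prototypical-network backbone this model builds on, a minimal sketch of episodic few-shot classification is given below; it omits the paper's hierarchical attention layers and uses randomly generated embeddings purely for illustration.

```python
# Illustrative sketch only: vanilla prototypical-network classification for one
# few-shot episode, without HAPN's hierarchical attention; embeddings are assumed
# to be precomputed tensors.
import torch

def prototypical_logits(support: torch.Tensor, query: torch.Tensor) -> torch.Tensor:
    """support: [n_classes, k_shot, dim]; query: [n_query, dim].
    Returns [n_query, n_classes] logits (negative Euclidean distances to prototypes)."""
    prototypes = support.mean(dim=1)        # average each class's support embeddings
    dists = torch.cdist(query, prototypes)  # pairwise distances to each prototype
    return -dists                           # closer prototype => higher logit

# Example episode: 5-way 3-shot with 64-dim embeddings and 10 query instances.
support = torch.randn(5, 3, 64)
query = torch.randn(10, 64)
pred = prototypical_logits(support, query).argmax(dim=1)  # predicted class per query
```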
Co-authors
- Lihu Chen 1
- Adam Dejl 1
- Gabriel Freedman 1
- Tengchao Lv 1
- Antonio Rago 1