Sheng-Lun Wei


2024

Unveiling Selection Biases: Exploring Order and Token Sensitivity in Large Language Models
Sheng-Lun Wei | Cheng-Kuang Wu | Hen-Hsen Huang | Hsin-Hsi Chen
Findings of the Association for Computational Linguistics ACL 2024

In this paper, we investigate the phenomenon of “selection biases” in Large Language Models (LLMs), focusing on problems where models are tasked with choosing the optimal option from an ordered sequence. We delve into biases related to option order and token usage, which significantly impact LLMs’ decision-making processes. We also quantify the impact of these biases through an extensive empirical analysis across multiple models and tasks. Furthermore, we propose mitigation strategies to enhance model performance. Our key contributions are threefold: 1) precisely quantifying the influence of option order and token usage on LLMs, 2) developing strategies to mitigate the impact of token and order sensitivity to enhance robustness, and 3) offering a detailed analysis of sensitivity across models and tasks, which informs the creation of more stable and reliable LLM applications for selection problems.

2016

NL2KB: Resolving Vocabulary Gap between Natural Language and Knowledge Base in Knowledge Base Construction and Retrieval
Sheng-Lun Wei | Yen-Pin Chiu | Hen-Hsen Huang | Hsin-Hsi Chen
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: System Demonstrations

The words used to express relations in natural language (NL) statements may differ from those used to represent properties in knowledge bases (KB). This vocabulary gap is a barrier to knowledge base construction and retrieval. With NL2KB, the demo system presented in this paper, users can browse the KB properties to which a given NL relational pattern may be mapped. Conversely, they can retrieve the sets of NL relational patterns for a given KB property. We describe in detail how the mapping is established. Although the mined patterns are used for Chinese knowledge base applications, the methodology can be extended to other languages.