Xiaoguang Qi


2024

APE: Active Learning-based Tooling for Finding Informative Few-shot Examples for LLM-based Entity Matching
Kun Qian | Yisi Sang | Farima Bayat† | Anton Belyi | Xianqi Chu | Yash Govind | Samira Khorshidi | Rahul Khot | Katherine Luna | Azadeh Nikfarjam | Xiaoguang Qi | Fei Wu | Xianhan Zhang | Yunyao Li
Proceedings of the Fifth Workshop on Data Science with Human-in-the-Loop (DaSH 2024)

Prompt engineering is an iterative procedure that often requires extensive manual effort to formulate suitable instructions for effectively directing large language models (LLMs) in specific tasks. Incorporating few-shot examples is a vital and effective approach to provide LLMs with precise instructions, leading to improved LLM performance. Nonetheless, identifying the most informative demonstrations for LLMs is labor-intensive, frequently entailing sifting through an extensive search space. In this demonstration, we showcase a human-in-the-loop tool called APE (Active Prompt Engineering) designed for refining prompts through active learning. Drawing inspiration from active learning, APE iteratively selects the most ambiguous examples for human feedback, which will be transformed into few-shot examples within the prompt.

2022

Improving Human Annotation Effectiveness for Fact Collection by Identifying the Most Relevant Answers
Pranav Kamath | Yiwen Sun | Thomas Semere | Adam Green | Scott Manley | Xiaoguang Qi | Kun Qian | Yunyao Li
Proceedings of the Fourth Workshop on Data Science with Human-in-the-Loop (Language Advances)

Identifying and integrating missing facts is a crucial task for knowledge graph completion to ensure robustness towards downstream applications such as question answering. Adding new facts to a knowledge graph in a real-world system often involves human verification effort, where candidate facts are verified for accuracy by human annotators. This process is labor-intensive, time-consuming, and inefficient since only a small number of missing facts can be identified. This paper proposes a simple but effective human-in-the-loop framework for fact collection that searches for a diverse set of highly relevant candidate facts for human annotation. Empirical results presented in this work demonstrate that the proposed solution leads to improvements in both i) the quality of the candidate facts and ii) the ability to discover more facts to grow the knowledge graph without requiring additional human effort.