This is an internal, incomplete preview of a proposed change to the ACL Anthology.
For efficiency reasons, we don't generate MODS or Endnote formats, and the preview may be incomplete in other ways, or contain mistakes.
Do not treat this content as an official publication.
SeokyoungHong
Fixing paper assignments
Please select all papers that belong to the same person.
Indicate below which author they should be assigned to.
Intent discovery in task-oriented dialogue is typically cast as single-turn intent classification, leaving systems brittle when user goals fall outside predefined inventories. We reformulate the task as multi-turn zero-shot intent discovery and present KSTC, a framework that (i) embeds dialogue contexts, (ii) performs coarse clustering, (iii) generates predicted theme label for each cluster, (iv) refines clusters using the Large Language Model (LLM) using predicted theme label, and (v) relocates utterances according to user’s preference. Because generating informative predicted theme label is crucial during the LLM-driven cluster refinement process, we propose the Task Independent Slots (TIS), which generates effective theme label by extracting verb and noun slot–value.Evaluated on DSTC12 Track2 dataset, KSTC took the first place, improving clustering and labeling quality without in-domain supervision. Results show that leveraging conversational context and slot-guided LLM labeling yields domain-agnostic theme clusters that remain consistent under distributional shift. KSTC thus offers a scalable, label-free solution for real-world dialogue systems that must continuously surface novel user intents. We will release our code and prompts publicly.
As Large Language Models (LLMs) evolve into powerful agentic systems, the telecommunications industry’s expansion into AI services necessitates industry-grounded benchmarks to evaluate their underexplored domain-specific capabilities. To address the gap left by generic benchmarks that fail to assess realistic, non-English performance, we present TelAgentBench, a Korean benchmark for the telecommunications domain evaluating five core agentic capabilities: Reasoning, Planning, Action (tool-use), Retrieval-Augmented Generation, and Instruction Following. Evaluations reveal significant performance disparities between models that employ explicit reasoning and those that do not, providing actionable insights for deploying agentic LLMs in real-world telecommunications tasks.
The telecommunications industry, characterized by its vast customer base and complex service offerings, necessitates a high level of domain expertise and proficiency in customer service center operations. Consequently, there is a growing demand for Large Language Models (LLMs) to augment the capabilities of customer service representatives. This paper introduces a methodology for developing a specialized Telecommunications LLM (Telco LLM) designed to enhance the efficiency of customer service agents and promote consistency in service quality across representatives. We present the construction process of TelBench, a novel dataset created for performance evaluation of customer service expertise in the telecommunications domain. We also evaluate various LLMs and demonstrate the ability to benchmark both proprietary and open-source LLMs on predefined telecommunications-related tasks, thereby establishing metrics that define telcommunications performance.