Ram Yazdi
The rise of LLMs has deflected a growing portion of human-computer interactions towards LLM-based chatbots. The remarkable abilities of these models allow users to interact using long, diverse natural language text covering a wide range of topics and styles. Phrasing these messages is a time- and effort-consuming task, calling for an autocomplete solution to assist users. We present **ChaI-TeA**: **Cha**t **I**n**te**raction **A**utocomplete, an autocomplete evaluation framework for LLM-based chatbot interactions. The framework includes a formal definition of the task, curated datasets and suitable metrics. We use it to evaluate 11 models on this task, finding that while current off-the-shelf models perform fairly well, there is still much room for improvement, mainly in the ranking of the generated suggestions. We provide insights for practitioners working on this task and open new research directions for researchers in the field. We release our framework to serve as a foundation for future research.
When customers search online for a product they are not familiar with, their needs are often expressed through subjective product attributes, such as “picture quality” for a TV or “easy to clean” for a sofa. In contrast, the product catalog in online stores includes objective attributes such as “screen resolution” or “material”. In this work, we aim to find a link between the objective product catalog and the subjective needs of the customers, to help customers better understand the product space using their own words. We apply correlation-based methods to the store’s product catalog and product reviews in order to find the best potential links between objective and subjective attributes; next, Large Language Models (LLMs) reduce spurious correlations by incorporating common sense and world knowledge (e.g., picture quality is indeed affected by screen resolution, and 8k is the best one). We curate a dataset for this task and show that our combined approach outperforms correlation-only and causation-only approaches.
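As a rough illustration of the correlation step described in this abstract, the sketch below scores candidate links between objective catalog attributes and subjective review attributes using Pearson correlation. It is not the authors' method: the function names (`pearson`, `rank_links`), the choice of Pearson correlation, and the toy numbers are all assumptions made for the example, and the LLM-based filtering of spurious links is not shown.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant attribute carries no correlation signal
    return cov / math.sqrt(var_x * var_y)


def rank_links(objective, subjective):
    """Rank (objective, subjective) attribute pairs by |correlation|.

    Both arguments map an attribute name to one value per product,
    with products listed in the same order in every sequence.
    """
    links = [
        (obj_name, subj_name, pearson(obj_vals, subj_vals))
        for obj_name, obj_vals in objective.items()
        for subj_name, subj_vals in subjective.items()
    ]
    return sorted(links, key=lambda link: abs(link[2]), reverse=True)


# Hypothetical toy data for four TVs (values invented for illustration).
objective = {
    "screen resolution (vertical px)": [720, 1080, 2160, 4320],
    "weight (kg)": [5, 9, 7, 6],
}
subjective = {
    # share of reviews that praise picture quality
    "picture quality": [0.2, 0.4, 0.7, 0.9],
}

for obj_name, subj_name, r in rank_links(objective, subjective):
    print(f"{obj_name} -> {subj_name}: r = {r:.2f}")
```

In a pipeline like the one the abstract describes, the highest-ranked pairs would be candidates only; a second stage would still need to discard pairs that correlate by coincidence.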
The best solution of structured prediction models in NLP is often inaccurate because of the limited expressive power of the model or inexact parameter estimation. One way to mitigate this problem is sampling candidate solutions from the model’s solution space, reasoning that effective exploration of this space should yield high-quality solutions. Unfortunately, sampling is often computationally hard and many works hence back off to sub-optimal strategies, such as extraction of the best scoring solutions of the model, which are not as diverse as sampled solutions. In this paper we propose a perturbation-based approach where sampling from a probabilistic model is computationally efficient. We present a learning algorithm for the variance of the perturbations, and empirically demonstrate its importance. Moreover, while finding the argmax in our model is intractable, we propose an efficient and effective approximation. We apply our framework to cross-lingual dependency parsing across 72 corpora from 42 languages and to lightly supervised dependency parsing across 13 corpora from 12 languages, and demonstrate strong results in terms of both the quality of the entire solution list and of the final solution.
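The general idea of perturbation-based sampling can be sketched on a toy problem: add random noise to each candidate's score and return the argmax of the perturbed scores. The example below uses Gumbel noise over a small explicit candidate set, which is an assumption chosen for illustration; it is not the paper's model, which perturbs a dependency parser and learns the perturbation variance. The names `gumbel_noise`, `perturb_and_argmax`, `sample_candidates`, and the fixed `scale` parameter are hypothetical.

```python
import math
import random
from collections import Counter


def gumbel_noise(rng, scale=1.0):
    """Draw one Gumbel(0, scale) sample by inverse transform sampling."""
    u = max(rng.random(), 1e-12)  # avoid log(0); random() is in [0, 1)
    return -scale * math.log(-math.log(u))


def perturb_and_argmax(scores, rng, scale=1.0):
    """Perturb every candidate score and return the best candidate."""
    perturbed = {cand: s + gumbel_noise(rng, scale) for cand, s in scores.items()}
    return max(perturbed, key=perturbed.get)


def sample_candidates(scores, k, seed=0, scale=1.0):
    """Draw k candidates; a larger scale gives more diverse samples."""
    rng = random.Random(seed)
    return [perturb_and_argmax(scores, rng, scale) for _ in range(k)]


# Hypothetical model scores for three candidate parse trees.
scores = {"tree_a": 2.0, "tree_b": 1.0, "tree_c": 0.0}

samples = sample_candidates(scores, k=2000, seed=0)
print(Counter(samples).most_common())
```

With `scale=0` the procedure reduces to plain argmax decoding, while a positive scale trades some score for diversity in the candidate list. In this toy setting the scale is fixed by hand, whereas the abstract describes learning the variance of the perturbations.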