2024
Leveraging LLM Reasoning Enhances Personalized Recommender Systems
Alicia Tsai | Adam Kraft | Long Jin | Chenwei Cai | Anahita Hosseini | Taibai Xu | Zemin Zhang | Lichan Hong | Ed Chi | Xinyang Yi
Findings of the Association for Computational Linguistics: ACL 2024
Recent advancements have showcased the potential of Large Language Models (LLMs) in executing reasoning tasks, particularly facilitated by Chain-of-Thought (CoT) prompting. While tasks like arithmetic reasoning involve clear, definitive answers and logical chains of thought, applying LLM reasoning to recommender systems (RecSys) presents a distinct challenge. RecSys tasks revolve around subjectivity and personalized preferences, an under-explored domain for LLMs' reasoning capabilities. Our study explores several aspects of reasoning for RecSys and demonstrates how task quality improves when LLM reasoning is utilized in both zero-shot and fine-tuning settings. Additionally, we propose Rec-SAVER (Recommender Systems Automatic Verification and Evaluation of Reasoning) to automatically assess the quality of LLM reasoning responses without requiring curated gold references or human raters. We show that our framework aligns with real human judgment on the coherence and faithfulness of reasoning responses. Overall, our work shows that incorporating reasoning into RecSys can improve personalized tasks, paving the way for further advancements in recommender system methodologies.
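To make the setup concrete, here is a minimal sketch of what zero-shot chain-of-thought prompting for a recommendation task can look like. The prompt template, the "Answer:" delimiter, and the parsing helper are illustrative assumptions, not the exact prompts or the Rec-SAVER pipeline from the paper.

```python
# Sketch of zero-shot CoT prompting for a rating-prediction RecSys task.
# Template and delimiters are illustrative assumptions, not the paper's prompts.

def build_cot_prompt(user_history: list[str], candidate_item: str) -> str:
    """Ask the model to reason about the user's preferences before answering."""
    history = "\n".join(f"- {item}" for item in user_history)
    return (
        "A user has interacted with the following items:\n"
        f"{history}\n\n"
        f"Candidate item: {candidate_item}\n\n"
        "Let's think step by step about the user's preferences, then end with\n"
        "a line of the form 'Answer: <rating from 1 to 5>'."
    )

def split_reasoning_and_answer(response: str) -> tuple[str, str]:
    """Separate the free-form reasoning trace from the final answer."""
    reasoning, _, answer = response.rpartition("Answer:")
    return reasoning.strip(), answer.strip()

if __name__ == "__main__":
    print(build_cot_prompt(["The Martian", "Project Hail Mary"], "Artemis"))
    # A hypothetical model reply, to show how the trace is separated:
    reasoning, answer = split_reasoning_and_answer(
        "The user gravitates toward hard science fiction... Answer: 4"
    )
    print(answer)  # -> "4"
```

Separating the reasoning trace from the final answer in this way is also what makes it possible to evaluate the trace itself, e.g. for coherence and faithfulness, independently of task accuracy.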
2021
Style Control for Schema-Guided Natural Language Generation
Alicia Tsai | Shereen Oraby | Vittorio Perera | Jiun-Yu Kao | Yuheng Du | Anjali Narayan-Chen | Tagyoung Chung | Dilek Hakkani-Tur
Proceedings of the 3rd Workshop on Natural Language Processing for Conversational AI
Natural Language Generation (NLG) for task-oriented dialogue systems focuses on communicating specific content accurately, fluently, and coherently. While these attributes are crucial for a successful dialogue, it is also desirable to simultaneously accomplish specific stylistic goals, such as response length, point-of-view, descriptiveness, sentiment, formality, and empathy. In this work, we focus on stylistic control and evaluation for schema-guided NLG, with the joint goals of achieving both semantic and stylistic control. We experiment in detail with various controlled generation methods for large pretrained language models: specifically, conditional training, guided fine-tuning, and guided decoding. We discuss their advantages and limitations and evaluate them with a broad range of automatic and human evaluation metrics. Our results show that while high style accuracy and semantic correctness are easier to achieve for more lexically defined styles with conditional training, stylistic control is also achievable for more semantically complex styles using discriminator-based guided decoding methods. The results also suggest that methods that are more scalable (requiring less hyperparameter tuning) and that disentangle context generation from stylistic variation are more effective at achieving semantic correctness and style accuracy.
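To illustrate the decoding-time option, below is a minimal sketch of discriminator-based guided decoding: each candidate continuation is rescored by mixing the generator's log-probability with a style discriminator's score. The toy vocabulary, the score tables, and the mixing weight alpha are assumptions for illustration; the paper's actual generators, discriminators, and styles differ.

```python
import math

# Toy stand-ins: next-token log-probs from a "generator" and a discriminator's
# probability that each continuation matches a target style (here, formality).
# Both tables are illustrative assumptions, not real model outputs.
LM_LOGPROB = {"thanks": math.log(0.5), "thank you": math.log(0.3), "thx": math.log(0.2)}
P_FORMAL = {"thanks": 0.4, "thank you": 0.9, "thx": 0.05}

def guided_choice(candidates, alpha):
    """Pick the candidate maximizing lm_logprob + alpha * log(discriminator score).

    alpha = 0 recovers the generator's own preference; larger alpha trades
    fluency for style, which is the knob guided decoding exposes.
    """
    return max(candidates, key=lambda t: LM_LOGPROB[t] + alpha * math.log(P_FORMAL[t]))

if __name__ == "__main__":
    print(guided_choice(list(LM_LOGPROB), alpha=0.0))  # generator alone -> "thanks"
    print(guided_choice(list(LM_LOGPROB), alpha=1.0))  # style-guided -> "thank you"
```

In a full decoder the same rescoring would be applied step by step inside beam search rather than to complete strings, but the generator/discriminator trade-off is the same.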
Proceedings of the Fifth Workshop on Widening Natural Language Processing
Erika Varis | Ryan Georgi | Alicia Tsai | Antonios Anastasopoulos | Khyathi Chandu | Xanda Schofield | Surangika Ranathunga | Haley Lepp | Tirthankar Ghosal
2020
Sparse Optimization for Unsupervised Extractive Summarization of Long Documents with the Frank-Wolfe Algorithm
Alicia Tsai | Laurent El Ghaoui
Proceedings of SustaiNLP: Workshop on Simple and Efficient Natural Language Processing
We address the problem of unsupervised extractive document summarization, especially for long documents. We model the unsupervised problem as a sparse auto-regression problem and approximate the resulting combinatorial problem via a convex, norm-constrained relaxation. We solve it using a dedicated Frank-Wolfe algorithm. To generate a summary with k sentences, the algorithm only needs to execute approximately k iterations, making it very efficient for long documents. We evaluate our approach against two other unsupervised methods using both lexical (standard) ROUGE scores and semantic (embedding-based) ones. Our method achieves better results on both datasets and works especially well when combined with embeddings for highly paraphrased summaries.
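As a concrete illustration of why roughly k iterations yield a k-sentence summary, here is a minimal sketch of Frank-Wolfe on an L1-ball-constrained least-squares problem, standing in for the paper's norm-constrained formulation. The toy data, the radius tau, and the classic 2/(k+2) step size are illustrative assumptions.

```python
import numpy as np

def frank_wolfe_l1(X, y, tau, n_iters):
    """Frank-Wolfe for min ||X @ w - y||^2 subject to ||w||_1 <= tau.

    Over the L1 ball, the linear minimization oracle returns a signed,
    scaled basis vector, so each iteration activates at most one new
    coordinate: after k steps the iterate has at most k nonzeros.
    """
    w = np.zeros(X.shape[1])
    for k in range(n_iters):
        grad = 2 * X.T @ (X @ w - y)
        i = int(np.argmax(np.abs(grad)))   # coordinate with steepest descent
        s = np.zeros_like(w)
        s[i] = -tau * np.sign(grad[i])     # L1-ball vertex (the LMO solution)
        gamma = 2.0 / (k + 2.0)            # standard Frank-Wolfe step size
        w = (1.0 - gamma) * w + gamma * s
    return w

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.standard_normal((50, 20))      # toy sentence features
    w_true = np.zeros(20)
    w_true[[3, 7]] = [1.0, -0.5]           # two "summary" sentences
    w = frank_wolfe_l1(X, X @ w_true, tau=2.0, n_iters=5)
    print(np.nonzero(w)[0])                # at most 5 active coordinates
```

Under the sparse auto-regression view, the nonzero coordinates of w index the sentences selected for the summary, which is why a k-sentence summary needs only about k iterations.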
Proceedings of the Fourth Widening Natural Language Processing Workshop
Rossana Cunha | Samira Shaikh | Erika Varis | Ryan Georgi | Alicia Tsai | Antonios Anastasopoulos | Khyathi Raghavi Chandu