2024
Narrowing the Gap between Zero- and Few-shot Machine Translation by Matching Styles
Weiting Tan | Haoran Xu | Lingfeng Shen | Shuyue Stella Li | Kenton Murray | Philipp Koehn | Benjamin Van Durme | Yunmo Chen
Findings of the Association for Computational Linguistics: NAACL 2024
Large language models trained primarily in a monolingual setting have demonstrated their ability to generalize to machine translation using zero- and few-shot examples with in-context learning. However, even though zero-shot translations are relatively good, there remains a discernible gap between their performance and that of the few-shot setting. In this paper, we investigate the factors contributing to this gap and find that it can largely be closed (by about 70%) by matching the writing styles of the target corpus. Additionally, we explore potential approaches to enhance zero-shot baselines without the need for parallel demonstration examples, providing valuable insights into how these methods contribute to improving translation metrics.
The Language Barrier: Dissecting Safety Challenges of LLMs in Multilingual Contexts
Lingfeng Shen | Weiting Tan | Sihao Chen | Yunmo Chen | Jingyu Zhang | Haoran Xu | Boyuan Zheng | Philipp Koehn | Daniel Khashabi
Findings of the Association for Computational Linguistics: ACL 2024
As the influence of large language models (LLMs) spans across global communities, their safety challenges in multilingual settings become paramount for alignment research. This paper examines the variations in safety challenges faced by LLMs across different languages and discusses approaches to alleviating such concerns. By comparing how state-of-the-art LLMs respond to the same set of malicious prompts written in higher- vs. lower-resource languages, we observe that (1) LLMs tend to generate unsafe responses much more often when a malicious prompt is written in a lower-resource language, and (2) LLMs tend to generate more irrelevant responses to malicious prompts in lower-resource languages. To understand where the discrepancy can be attributed, we study the effect of instruction tuning with reinforcement learning from human feedback (RLHF) or supervised finetuning (SFT) on the HH-RLHF dataset. Surprisingly, while training with high-resource languages improves model alignment, training in lower-resource languages yields minimal improvement. This suggests that the bottleneck of cross-lingual alignment is rooted in the pretraining stage. Our findings highlight the challenges in cross-lingual LLM safety, and we hope they inform future research in this direction.
SemStamp: A Semantic Watermark with Paraphrastic Robustness for Text Generation
Abe Hou | Jingyu Zhang | Tianxing He | Yichen Wang | Yung-Sung Chuang | Hongwei Wang | Lingfeng Shen | Benjamin Van Durme | Daniel Khashabi | Yulia Tsvetkov
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)
Existing watermarked generation algorithms employ token-level designs and are therefore vulnerable to paraphrase attacks. To address this issue, we introduce watermarking on the semantic representation of sentences. We propose SemStamp, a robust sentence-level semantic watermarking algorithm that uses locality-sensitive hashing (LSH) to partition the semantic space of sentences. The algorithm encodes and LSH-hashes a candidate sentence generated by a language model, and conducts rejection sampling until the sampled sentence falls in a watermarked partition of the semantic embedding space. To test the paraphrastic robustness of watermarking algorithms, we propose a “bigram paraphrase” attack that produces paraphrases with small bigram overlap with the original sentence. This attack is shown to be effective against existing token-level watermark algorithms, while causing only minor degradation to SemStamp. Experimental results show that our novel semantic watermark algorithm is not only more robust than the previous state-of-the-art method on various paraphrasers and domains, but also better at preserving the quality of generation.
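To make the rejection-sampling idea in this abstract concrete, here is a minimal sketch of sentence-level watermarking with random-hyperplane LSH. The embed() and generate_candidate() helpers, the signature length, and the green-bucket fraction are illustrative assumptions, not the authors' released implementation.

```python
# Sketch: sentence-level watermarking via LSH bucketing and rejection sampling.
import numpy as np

rng = np.random.default_rng(seed=42)          # shared secret seed
DIM, N_BITS = 384, 8                          # embedding size, LSH signature length (assumed)
hyperplanes = rng.normal(size=(N_BITS, DIM))  # random hyperplanes that partition the space

def lsh_signature(embedding: np.ndarray) -> int:
    """Map a sentence embedding to an integer LSH bucket via hyperplane signs."""
    bits = (hyperplanes @ embedding) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

# Mark half of the 2**N_BITS buckets as "green" (watermarked) using the secret seed.
green_buckets = set(rng.choice(2**N_BITS, size=2**(N_BITS - 1), replace=False).tolist())

def watermarked_sentence(generate_candidate, embed, max_tries: int = 100) -> str:
    """Rejection-sample candidate sentences until one lands in a green LSH bucket."""
    for _ in range(max_tries):
        sent = generate_candidate()           # e.g., sample one sentence continuation from an LM
        if lsh_signature(embed(sent)) in green_buckets:
            return sent
    return sent                               # fall back to the last sample if none is accepted
```

Detection would then check whether an unusually high fraction of a text's sentences fall into the green buckets derived from the secret seed.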
2023
Flatness-Aware Prompt Selection Improves Accuracy and Sample Efficiency
Lingfeng Shen | Weiting Tan | Boyuan Zheng | Daniel Khashabi
Findings of the Association for Computational Linguistics: EMNLP 2023
With the growing capabilities of large language models, prompting has become the dominant way to access them. This has motivated the development of strategies for automatically selecting effective language prompts. In this paper, we introduce **pFlat** (prompt flatness), a new metric to quantify the expected utility of a language prompt. This metric is inspired by *flatness* regularization in statistical learning, which quantifies the robustness of a model to perturbations of its parameters. We provide theoretical foundations for this metric and its relationship with other prompt selection metrics, providing a comprehensive understanding of existing methods. Empirically, we show that combining **pFlat** with existing metrics improves both performance and sample efficiency. Our metric outperforms previous prompt selection metrics with an average increase of 10% in Pearson correlation across 6 classification benchmarks, and the prompts selected by our metric achieve 5% higher accuracy than those selected by previous metrics across the benchmarks.
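As a rough illustration of the flatness intuition described above (not the paper's exact pFlat estimator), one can perturb model parameters with small Gaussian noise and measure how much the loss for a given prompt moves; the loss_fn and params below are hypothetical stand-ins.

```python
# Sketch of a flatness-style prompt score: average loss change under small parameter noise.
import numpy as np

def flatness_score(loss_fn, params: np.ndarray, prompt: str,
                   sigma: float = 1e-3, n_samples: int = 8, seed: int = 0) -> float:
    """Estimate how much the loss for `prompt` moves when parameters are perturbed."""
    rng = np.random.default_rng(seed)
    base = loss_fn(params, prompt)                      # loss of the unperturbed model
    deltas = []
    for _ in range(n_samples):
        noise = rng.normal(scale=sigma, size=params.shape)
        deltas.append(abs(loss_fn(params + noise, prompt) - base))
    return float(np.mean(deltas))                       # smaller = flatter

# Candidate prompts would then be ranked by combining this score with existing selection metrics.
```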
Sen2Pro: A Probabilistic Perspective to Sentence Embedding from Pre-trained Language Model
Lingfeng Shen | Haiyun Jiang | Lemao Liu | Shuming Shi
Proceedings of the 8th Workshop on Representation Learning for NLP (RepL4NLP 2023)
2022
On the Evaluation Metrics for Paraphrase Generation
Lingfeng Shen | Lemao Liu | Haiyun Jiang | Shuming Shi
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing
In this paper, we revisit automatic metrics for paraphrase evaluation and obtain two findings that disobey conventional wisdom: (1) reference-free metrics achieve better performance than their reference-based counterparts, and (2) most commonly used metrics do not align well with human annotation. The underlying reasons behind these findings are explored through additional experiments and in-depth analyses. Based on these experiments and analyses, we propose ParaScore, a new evaluation metric for paraphrase generation. It possesses the merits of both reference-based and reference-free metrics and explicitly models lexical divergence. With our analysis and improvements, the proposed reference-based variant also outperforms reference-free metrics. Experimental results demonstrate that ParaScore significantly outperforms existing metrics.
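A hedged sketch of the general recipe this abstract describes: reward semantic similarity to the source while explicitly crediting lexical divergence from it. The embed() helper, the edit-distance divergence, and the weights are illustrative assumptions, not the published ParaScore formula.

```python
# Sketch: semantic similarity plus a bounded bonus for lexical divergence.
import numpy as np

def edit_distance(a: str, b: str) -> int:
    """Standard Levenshtein distance over characters (single-row dynamic programming)."""
    dp = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, cb in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1, prev + (ca != cb))
    return dp[-1]

def para_score(source: str, candidate: str, embed, gamma: float = 0.4) -> float:
    """Score a paraphrase: cosine similarity of embeddings plus a capped divergence bonus."""
    e_src, e_cand = embed(source), embed(candidate)
    sim = float(e_src @ e_cand / (np.linalg.norm(e_src) * np.linalg.norm(e_cand)))
    divergence = edit_distance(source, candidate) / max(len(source), len(candidate), 1)
    return sim + gamma * min(divergence, 0.5)   # cap the bonus so trivial rewrites don't win
```

Capping the divergence term reflects the intuition from the abstract that a good paraphrase should differ lexically from the source without drifting semantically.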