2025
pdf
bib
abs
SweEval: Do LLMs Really Swear? A Safety Benchmark for Testing Limits for Enterprise Use
Hitesh Laxmichand Patel
|
Amit Agarwal
|
Arion Das
|
Bhargava Kumar
|
Srikant Panda
|
Priyaranjan Pattnayak
|
Taki Hasan Rafi
|
Tejaswini Kumar
|
Dong-Kyu Chae
Proceedings of the 2025 Conference of the Nations of the Americas Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 3: Industry Track)
Enterprise customers are increasingly adopting Large Language Models (LLMs) for critical communication tasks, such as drafting emails, crafting sales pitches, and composing casual messages. Deploying such models across different regions requires them to understand diverse cultural and linguistic contexts and generate safe and respectful responses. For enterprise applications, it is crucial to mitigate reputational risks, maintain trust, and ensure compliance by effectively identifying and handling unsafe or offensive language. To address this, we introduce SweEval, a benchmark simulating real-world scenarios with variations in tone (positive or negative) and context (formal or informal). The prompts explicitly instruct the model to include specific swear words while completing the task. This benchmark evaluates whether LLMs comply with or resist such inappropriate instructions and assesses their alignment with ethical frameworks, cultural nuances, and language comprehension capabilities. In order to advance research in building ethically aligned AI systems for enterprise use and beyond, we release the dataset and code: https://github.com/amitbcp/multilingual_profanity.
2024
pdf
bib
abs
Simple Temperature Cool-down in Contrastive Framework for Unsupervised Sentence Representation Learning
Yoo Hyun Jeong
|
Myeong Soo Han
|
Dong-Kyu Chae
Findings of the Association for Computational Linguistics: EACL 2024
In this paper, we proposes a simple, tricky method to improve sentence representation of unsupervised contrastive learning. Even though contrastive learning has achieved great performances in both visual representation learning (VRL) and sentence representation learning (SRL) fields, we focus on the fact that there is a gap between characteristics and training dynamics of VRL and SRL. We first examine the role of temperature to bridge the gap between VRL and SRL, and find some temperature-dependent elements in SRL; i.e., a higher temperature causes overfitting of the uniformity while improving the alignment in earlier phase of training. Then, we design a temperature cool-down technique based on this observation, which helps PLMs to be more suitable for contrastive learning via preparation of uniform representation space. Our experimental results on widely-utilized benchmarks demonstrate the effectiveness and extensiblity of our method.
pdf
bib
abs
Bootstrap Your Own PLM: Boosting Semantic Features of PLMs for Unsuperivsed Contrastive Learning
Yoo Hyun Jeong
|
Myeong Soo Han
|
Dong-Kyu Chae
Findings of the Association for Computational Linguistics: EACL 2024
This paper aims to investigate the possibility of exploiting original semantic features of PLMs (pre-trained language models) during contrastive learning in the context of SRL (sentence representation learning). In the context of feature modification, we identified a method called IFM (implicit feature modification), which reduces the tendency of contrastive models for VRL (visual representation learning) to rely on feature-suppressing shortcut solutions. We observed that IFM did not work well for SRL, which may be due to differences between the nature of VRL and SRL. We propose BYOP, which boosts well-represented features, taking the opposite idea of IFM, under the assumption that SimCSE’s dropout-noise-based augmentation may be too simple to modify high-level semantic features, and that the features learned by PLMs are semantically meaningful and should be boosted, rather than removed. Extensive experiments lend credence to the logic of BYOP, which considers the nature of SRL.
pdf
bib
abs
A Simple Angle-based Approach for Contrastive Learning of Unsupervised Sentence Representation
Yoo Hyun Jeong
|
Myeongsoo Han
|
Dong-Kyu Chae
Findings of the Association for Computational Linguistics: EMNLP 2024
Contrastive learning has been successfully adopted in VRL (visual representation learning) by constructing effective contrastive pairs. A promising baseline SimCSE has made notable breakthroughs in unsupervised SRL (sentence representation learning) following the success of contrastive learning. However, considering the difference between VRL and SRL, there is still room for designing a novel contrastive framework specially targeted for SRL. We pro- pose a novel angle-based similarity function for contrastive objective. By examining the gra- dient of our contrastive objective, we show that an angle-based similarity function incites better training dynamics on SRL than the off-the-shelf cosine similarity: (1) effectively pulling a posi- tive instance toward an anchor instance in the early stage of training and (2) not excessively repelling a false negative instance during the middle of training. Our experimental results on widely-utilized benchmarks demonstrate the ef- fectiveness and extensibility of our novel angle- based approach. Subsequent analyses establish its improved sentence representation power.
pdf
bib
abs
BanglaAutoKG: Automatic Bangla Knowledge Graph Construction with Semantic Neural Graph Filtering
Azmine Toushik Wasi
|
Taki Hasan Rafi
|
Raima Islam
|
Dong-Kyu Chae
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Knowledge Graphs (KGs) have proven essential in information processing and reasoning applications because they link related entities and give context-rich information, supporting efficient information retrieval and knowledge discovery; presenting information flow in a very effective manner. Despite being widely used globally, Bangla is relatively underrepresented in KGs due to a lack of comprehensive datasets, encoders, NER (named entity recognition) models, POS (part-of-speech) taggers, and lemmatizers, hindering efficient information processing and reasoning applications in the language. Addressing the KG scarcity in Bengali, we propose BanglaAutoKG, a pioneering framework that is able to automatically construct Bengali KGs from any Bangla text. We utilize multilingual LLMs to understand various languages and correlate entities and relations universally. By employing a translation dictionary to identify English equivalents and extracting word features from pre-trained BERT models, we construct the foundational KG. To reduce noise and align word embeddings with our goal, we employ graph-based polynomial filters. Lastly, we implement a GNN-based semantic filter, which elevates contextual understanding and trims unnecessary edges, culminating in the formation of the definitive KG. Empirical findings and case studies demonstrate the universal effectiveness of our model, capable of autonomously constructing semantically enriched KGs from any text. Data and code are available here: https://github.com/azminewasi/BanglaAutoKG
pdf
bib
abs
SentiCSE: A Sentiment-aware Contrastive Sentence Embedding Framework with Sentiment-guided Textual Similarity
Jaemin Kim
|
Yohan Na
|
Kangmin Kim
|
Sang-Rak Lee
|
Dong-Kyu Chae
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Recently, sentiment-aware pre-trained language models (PLMs) demonstrate impressive results in downstream sentiment analysis tasks. However, they neglect to evaluate the quality of their constructed sentiment representations; they just focus on improving the fine-tuning performance, which overshadows the representation quality. We argue that without guaranteeing the representation quality, their downstream performance can be highly dependent on the supervision of the fine-tuning data rather than representation quality. This problem would make them difficult to foray into other sentiment-related domains, especially where labeled data is scarce. We first propose Sentiment-guided Textual Similarity (SgTS), a novel metric for evaluating the quality of sentiment representations, which is designed based on the degree of equivalence in sentiment polarity between two sentences. We then propose SentiCSE, a novel Sentiment-aware Contrastive Sentence Embedding framework for constructing sentiment representations via combined word-level and sentence-level objectives, whose quality is guaranteed by SgTS. Qualitative and quantitative comparison with the previous sentiment-aware PLMs shows the superiority of our work. Our code is available at: https://github.com/nayohan/SentiCSE