2024
P-TA: Using Proximal Policy Optimization to Enhance Tabular Data Augmentation via Large Language Models
Shuo Yang | Chenchen Yuan | Yao Rong | Felix Steinbauer | Gjergji Kasneci
Findings of the Association for Computational Linguistics ACL 2024
A multitude of industries depend on accurate and reasonable tabular data augmentation for their business processes. Contemporary methodologies in generating tabular data revolve around utilizing Generative Adversarial Networks (GAN) or fine-tuning Large Language Models (LLM). However, GAN-based approaches are documented to produce samples with common-sense errors attributed to the absence of external knowledge. On the other hand, LLM-based methods exhibit a limited capacity to capture the disparities between synthesized and actual data distribution due to the absence of feedback from a discriminator during training. Furthermore, the decoding of LLM-based generation introduces gradient breakpoints, impeding the backpropagation of loss from a discriminator, thereby complicating the integration of these two approaches. To solve this challenge, we propose using proximal policy optimization (PPO) to apply GANs, guiding LLMs to enhance the probability distribution of tabular features. This approach enables the utilization of LLMs as generators for GANs in synthesizing tabular data. Our experiments demonstrate that PPO leads to an approximately 4% improvement in the accuracy of models trained on synthetically generated data over state-of-the-art across three real-world datasets.
A Trusted Multi-View Evidential Fusion Framework for Commonsense Reasoning
Shuo Yang
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
While deep learning models are powerful, they have limitations in tasks that require commonsense reasoning, as these tasks often involve interpreting information that may not be directly available in the input. Providing evidence has been proven to significantly enhance performance in commonsense reasoning tasks. However, there are various perspectives on evidence, including natural language explanations generated by pre-trained language models, facts derived from world knowledge like text corpora and knowledge bases, and rationales extracted from the input context. Hence, it is crucial to determine how to estimate the confidence degree of different evidence and how to combine them reliably. To address these challenges, this study proposes a trusted multi-view evidential fusion framework for reliable commonsense reasoning tasks that dynamically assesses the confidence of evidence and combines different views of evidence in a trustworthy manner. The proposed method is applied to three commonsense question-answering benchmarks, demonstrating that this approach can effectively reason with multi-view evidence and can compete with state-of-the-art performance.
Is Crowdsourcing Breaking Your Bank? Cost-Effective Fine-Tuning of Pre-trained Language Models with Proximal Policy Optimization
Shuo Yang | Gjergji Kasneci
Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024)
Wide usage of ChatGPT has highlighted the potential of reinforcement learning from human feedback. However, its training pipeline relies on manual ranking, a resource-intensive process. To reduce labor costs, we propose a self-supervised text ranking approach for applying Proximal Policy Optimization (PPO) to fine-tune language models while eliminating the need for human annotators. Our method begins with probabilistic sampling to encourage a language model to generate diverse responses for each input. We then employ the TextRank and ISODATA algorithms to rank and cluster these responses based on their semantics. Subsequently, we construct a reward model to learn the ranking and optimize our generative policy. Our experimental results, conducted using two language models on three tasks, demonstrate that the models trained by our method considerably outperform baselines regarding BLEU, GLEU, and METEOR scores. Furthermore, our manual evaluation shows that our ranking results exhibit a remarkably high consistency with those of humans. This research significantly reduces the training costs of proximal policy-guided models and demonstrates the potential for self-correction of language models.
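The ranking step described in this abstract can be illustrated with a minimal sketch of TextRank applied to a set of sampled responses. This is not the authors' implementation: the abstract does not specify the similarity measure, so the token-level Jaccard similarity below is an assumption standing in for a semantic one, and the function names `jaccard`, `textrank_scores`, and `rank_responses` are hypothetical. The ISODATA clustering and reward-model steps are not covered.

```python
from typing import List


def jaccard(a: str, b: str) -> float:
    """Token-level Jaccard similarity (assumed stand-in for semantic similarity)."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    if not sa or not sb:
        return 0.0
    return len(sa & sb) / len(sa | sb)


def textrank_scores(responses: List[str], damping: float = 0.85,
                    iters: int = 50) -> List[float]:
    """Score each response with weighted PageRank over a similarity graph."""
    n = len(responses)
    if n == 0:
        return []
    # Weighted, undirected similarity graph without self-loops.
    w = [[0.0 if i == j else jaccard(responses[i], responses[j])
          for j in range(n)] for i in range(n)]
    out_weight = [sum(row) for row in w]
    scores = [1.0 / n] * n
    for _ in range(iters):
        new_scores = []
        for i in range(n):
            # Each neighbour j passes on a share of its score proportional to w[j][i].
            rank = sum(w[j][i] / out_weight[j] * scores[j]
                       for j in range(n) if out_weight[j] > 0)
            new_scores.append((1 - damping) / n + damping * rank)
        scores = new_scores
    return scores


def rank_responses(responses: List[str]) -> List[int]:
    """Return response indices ordered from most to least central."""
    scores = textrank_scores(responses)
    return sorted(range(len(responses)), key=lambda i: scores[i], reverse=True)
```

Under these assumptions, a response that shares no tokens with any other response receives only the baseline score and is ranked last. A ranking like this could then serve as a training signal for a reward model.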
2022
MirrorAlign: A Super Lightweight Unsupervised Word Alignment Model via Cross-Lingual Contrastive Learning
Di Wu | Liang Ding | Shuo Yang | Mingyang Li
Proceedings of the 19th International Conference on Spoken Language Translation (IWSLT 2022)
Word alignment is essential for the downstream cross-lingual language understanding and generation tasks. Recently, the performance of the neural word alignment models has exceeded that of statistical models. However, they heavily rely on sophisticated translation models. In this study, we propose a super lightweight unsupervised word alignment model named MirrorAlign, in which bidirectional symmetric attention trained with a contrastive learning objective is introduced, and an agreement loss is employed to bind the attention maps, such that the alignments follow a mirror-like symmetry hypothesis. Experimental results on several public benchmarks demonstrate that our model achieves competitive, if not better, performance compared to the state of the art in word alignment while significantly reducing the training and decoding time on average. Further ablation analysis and case studies show the superiority of our proposed MirrorAlign. Notably, we recognize our model as a pioneering attempt to unify bilingual word embedding and word alignments. Encouragingly, our approach achieves a 16.4X speedup against GIZA++, and 50X parameter compression compared with the Transformer-based alignment methods. We release our code to facilitate the community: https://github.com/moore3930/MirrorAlign.
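The idea of an agreement loss that binds two attention maps under a mirror-like symmetry can be sketched as follows. This is an illustrative assumption, not the code from the linked repository: the abstract does not give the exact loss, so the mean squared difference between the source-to-target map and the transpose of the target-to-source map is used here, and the names `attention` and `agreement_loss` are hypothetical.

```python
import math
from typing import List

Matrix = List[List[float]]


def softmax(row: List[float]) -> List[float]:
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention(queries: Matrix, keys: Matrix) -> Matrix:
    """Row-normalised dot-product attention from each query to all keys."""
    return [softmax([sum(q * k for q, k in zip(qv, kv)) for kv in keys])
            for qv in queries]


def agreement_loss(src_to_tgt: Matrix, tgt_to_src: Matrix) -> float:
    """Mean squared difference between A and transpose(B).

    src_to_tgt has shape (m, n); tgt_to_src has shape (n, m).
    The loss is zero when the two maps mirror each other exactly.
    """
    m, n = len(src_to_tgt), len(tgt_to_src)
    total = 0.0
    for i in range(m):
        for j in range(n):
            total += (src_to_tgt[i][j] - tgt_to_src[j][i]) ** 2
    return total / (m * n)
```

With word embeddings for a source and a target sentence, one would compute `attention(src, tgt)` and `attention(tgt, src)` and add `agreement_loss` of the two maps to the training objective, so that both directions are pushed toward the same alignment.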
Mask and Regenerate: A Classifier-based Approach for Unpaired Sentiment Transformation of Reviews for Electronic Commerce Websites.
Shuo Yang
Proceedings of the Tenth International Workshop on Natural Language Processing for Social Media
Style transfer is the task of transferring a sentence into the target style while keeping its content. The major challenge is that parallel corpora are not available for various domains. In this paper, we propose a Mask-And-Regenerate approach (MAR). It learns from unpaired sentences by modifying the word-level style attributes. We cautiously integrate the deletion, insertion and substitution operations into our model. This enables our model to automatically apply different edit operations for different sentences. Specifically, we train a multilayer perceptron (MLP) as a style classifier to identify and mask style-characteristic words in the source inputs. Then we learn a language model on non-parallel data sets to score sentences and remove unnecessary masks. Finally, the masked source sentences are input to a Transformer to perform style transfer. The final results show that our proposed model exceeds the baselines by about 2 per cent in accuracy for both sentiment and style transfer tasks, with comparable or better content retention.