Jihoon Kim
2023
X-SNS: Cross-Lingual Transfer Prediction through Sub-Network Similarity
Taejun Yun | Jinhyeon Kim | Deokyeong Kang | Seonghoon Lim | Jihoon Kim | Taeuk Kim
Findings of the Association for Computational Linguistics: EMNLP 2023
Cross-lingual transfer (XLT) is an emergent ability of multilingual language models that preserves their performance on a task to a significant extent when evaluated in languages that were not included in the fine-tuning process. While English, due to its widespread usage, is typically regarded as the primary language for model adaptation in various tasks, recent studies have revealed that the efficacy of XLT can be amplified by selecting the most appropriate source languages based on specific conditions. In this work, we propose the utilization of sub-network similarity between two languages as a proxy for predicting the compatibility of the languages in the context of XLT. Our approach is model-oriented, better reflecting the inner workings of foundation models. In addition, it requires only a moderate amount of raw text from candidate languages, distinguishing it from the majority of previous methods that rely on external resources. In experiments, we demonstrate that our method is more effective than baselines across diverse tasks. Specifically, it shows proficiency in ranking candidates for zero-shot XLT, achieving an improvement of 4.6% on average in terms of NDCG@3. We also provide extensive analyses that confirm the utility of sub-networks for XLT prediction.
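The abstract does not specify how sub-networks are extracted or compared, so the following is only a minimal illustrative sketch: it assumes each language induces a binary parameter mask (e.g., by keeping the top-k most important parameters) and uses Jaccard overlap as the similarity score for ranking candidate source languages. The names and the keep-ratio are hypothetical, not the paper's actual procedure.

```python
# Hedged sketch: binary per-language sub-network masks compared via Jaccard overlap.
# The mask-extraction scheme and similarity measure are assumptions for illustration.
import numpy as np

def subnetwork_mask(importance_scores: np.ndarray, keep_ratio: float = 0.1) -> np.ndarray:
    """Keep the top-k most important parameters as a binary mask (assumed scheme)."""
    k = max(1, int(keep_ratio * importance_scores.size))
    threshold = np.partition(importance_scores.ravel(), -k)[-k]
    return importance_scores >= threshold

def subnetwork_similarity(mask_a: np.ndarray, mask_b: np.ndarray) -> float:
    """Jaccard overlap between two language-specific sub-network masks."""
    intersection = np.logical_and(mask_a, mask_b).sum()
    union = np.logical_or(mask_a, mask_b).sum()
    return float(intersection) / float(union) if union else 0.0

# Toy usage: rank candidate source languages by similarity to a target language.
rng = np.random.default_rng(0)
target_mask = subnetwork_mask(rng.random(10_000))
candidates = {lang: subnetwork_mask(rng.random(10_000)) for lang in ["de", "fr", "ko"]}
ranking = sorted(candidates,
                 key=lambda l: subnetwork_similarity(target_mask, candidates[l]),
                 reverse=True)
print(ranking)
```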
2019
Summary Level Training of Sentence Rewriting for Abstractive Summarization
Sanghwan Bae | Taeuk Kim | Jihoon Kim | Sang-goo Lee
Proceedings of the 2nd Workshop on New Frontiers in Summarization
As an attempt to combine extractive and abstractive summarization, Sentence Rewriting models adopt the strategy of first extracting salient sentences from a document and then paraphrasing the selected ones to generate a summary. However, the existing models in this framework mostly rely on sentence-level rewards or suboptimal labels, causing a mismatch between the training objective and the evaluation metric. In this paper, we present a novel training signal that directly maximizes summary-level ROUGE scores through reinforcement learning. In addition, we incorporate BERT into our model, making good use of its natural language understanding ability. In extensive experiments, we show that a combination of our proposed model and training procedure obtains new state-of-the-art performance on both the CNN/Daily Mail and New York Times datasets. We also demonstrate that it generalizes better on the DUC-2002 test set.
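The abstract only names a summary-level ROUGE reward optimized with reinforcement learning; the exact formulation is not given there. The sketch below is therefore an illustrative assumption: a hand-rolled summary-level ROUGE-1 F1 used as a reward, combined with a REINFORCE-style advantage (reward minus a baseline summary's reward).

```python
# Hedged sketch: summary-level ROUGE-1 F1 as an RL reward with a baseline.
# The reward definition and the baseline scheme are assumptions for illustration.
from collections import Counter
from typing import List

def rouge_n_f1(candidate: List[str], reference: List[str], n: int = 1) -> float:
    """Summary-level ROUGE-N F1 between whole candidate and reference token lists."""
    def ngrams(tokens: List[str]) -> Counter:
        return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    cand, ref = ngrams(candidate), ngrams(reference)
    overlap = sum((cand & ref).values())  # clipped n-gram matches
    if not cand or not ref or overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)

def policy_gradient_weight(sampled: List[str], reference: List[str], baseline: List[str]) -> float:
    """Advantage term: multiply by -log p(sampled actions) to form a REINFORCE loss."""
    return rouge_n_f1(sampled, reference) - rouge_n_f1(baseline, reference)

# Toy usage with whitespace-tokenized summaries.
ref = "the cat sat on the mat".split()
print(policy_gradient_weight("a cat sat on a mat".split(), ref, "the dog barked".split()))
```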
Co-authors
- Taeuk Kim 2
- Taejun Yun 1
- Jinhyeon Kim 1
- Deokyeong Kang 1
- Seonghoon Lim 1