Joongwon Kim
2023
TaskWeb: Selecting Better Source Tasks for Multi-task NLP
Joongwon Kim | Akari Asai | Gabriel Ilharco | Hannaneh Hajishirzi
Proceedings of the 2023 Conference on Empirical Methods in Natural Language Processing
Recent work in NLP has shown promising results in training models on large numbers of tasks to achieve better generalization. However, it is not well understood how tasks are related, and how helpful training tasks can be chosen for a new task. In this work, we investigate whether knowing task relationships via pairwise task transfer improves the choice of one or more source tasks that help in learning a new target task. We provide TaskWeb, a large-scale benchmark of pairwise task transfers for 22 NLP tasks using three different model types, sizes, and adaptation methods, spanning about 25,000 experiments. Then, we design a new method, TaskShop, based on our analysis of TaskWeb. TaskShop uses TaskWeb to estimate the benefit of using a source task for learning a new target task, and to choose a subset of helpful training tasks for multi-task training. Our method improves the overall ranking and top-k precision of source tasks by 10% and 38%, respectively. We also use TaskShop to build much smaller multi-task training sets that improve zero-shot performance across 11 different target tasks by at least 4.3%.
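The selection step described in this abstract lends itself to a short illustration. Below is a minimal Python sketch of pairwise-transfer-based source selection in the spirit of TaskShop, assuming a precomputed transfer matrix over known pivot tasks and a text-based similarity between each pivot task and the new target. The names (`transfer`, `similarity`, `score_sources`) and the weighted-average combination rule are illustrative assumptions, not the paper's released code or exact formula.

```python
import numpy as np

# Hypothetical inputs (illustrative only, not the paper's released artifacts):
#   transfer[s, p]  -- measured benefit of training on source task s before
#                      pivot task p, as in a TaskWeb-style pairwise benchmark
#   similarity[p]   -- text-based similarity between known pivot task p and
#                      the new, unseen target task

def score_sources(transfer: np.ndarray, similarity: np.ndarray) -> np.ndarray:
    """Score each source task for an unseen target by routing through known
    pivot tasks: sources that transfer well to pivots resembling the target
    receive high scores. The weighted average below is an illustrative
    combination rule, not the exact TaskShop estimator."""
    weights = similarity / similarity.sum()  # normalize pivot weights
    return transfer @ weights                # one score per source task

def top_k_sources(transfer: np.ndarray, similarity: np.ndarray, k: int = 5):
    """Return indices of the k most promising source tasks."""
    scores = score_sources(transfer, similarity)
    return np.argsort(scores)[::-1][:k]

# Toy example: 4 candidate source tasks, 3 known pivot tasks.
rng = np.random.default_rng(0)
transfer = rng.normal(size=(4, 3))       # stand-in for measured transfers
similarity = np.array([0.7, 0.2, 0.1])   # stand-in for target-pivot similarity
print(top_k_sources(transfer, similarity, k=2))
```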
2021
BiSECT: Learning to Split and Rephrase Sentences with Bitexts
Joongwon Kim | Mounica Maddela | Reno Kriz | Wei Xu | Chris Callison-Burch
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing
An important capability in NLP applications such as sentence simplification is taking a long, complex sentence and splitting it into shorter sentences, rephrasing as necessary. We introduce a novel dataset and a new model for this ‘split and rephrase’ task. Our BiSECT training data consists of 1 million long English sentences paired with shorter, meaning-equivalent English sentences. We obtain these by extracting 1-2 sentence alignments from bilingual parallel corpora and then using machine translation to convert both sides of the corpus into the same language. BiSECT contains higher-quality training examples than previous Split and Rephrase corpora, with sentence splits that require more significant modifications. We categorize the examples in our corpus and use these categories in a novel model that allows us to target specific regions of the input sentence to be split and edited. Moreover, we show that models trained on BiSECT can perform a wider variety of split operations and improve upon previous state-of-the-art approaches in automatic and human evaluations.
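The data construction described here (pivoting through a bitext, then translating back) can be sketched in a few lines. The following Python sketch assumes a list of 1-2 sentence alignments and a hypothetical `mt_translate` hook standing in for any MT system; none of these names come from the BiSECT release.

```python
from typing import List, Tuple

def mt_translate(sentences: List[str], src_lang: str, tgt_lang: str) -> List[str]:
    """Placeholder for a machine translation system (one translation per input).
    Hypothetical hook: plug in any MT model or API here."""
    raise NotImplementedError("plug in an MT system")

def build_split_pairs(
    bitext: List[Tuple[str, List[str]]], foreign_lang: str
) -> List[Tuple[str, str]]:
    """Turn 1-2 alignments (one long English sentence aligned to two foreign
    sentences) into (long sentence, split-and-rephrased sentences) training
    pairs by translating the foreign side back into English."""
    pairs = []
    for english_long, foreign_short in bitext:
        # The two aligned foreign sentences, rendered in English, give a
        # shorter, meaning-equivalent splitting of the long sentence.
        english_short = mt_translate(foreign_short, foreign_lang, "en")
        pairs.append((english_long, " ".join(english_short)))
    return pairs
```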