Fine-Tuning Language Models on Multiple Datasets for Citation Intention Classification

Zeren Shui; Petros Karypis; Daniel S. Karls; Mingjian Wen; Saurav Manchanda; Ellad B. Tadmor; George Karypis

doi:10.18653/v1/2024.findings-emnlp.974

Abstract

Citation intention classification (CIC) tools classify citations by their intention (e.g., background, motivation) and assist readers in evaluating the contribution of scientific literature. Prior research has shown that pretrained language models (PLMs) such as SciBERT can achieve state-of-the-art performance on CIC benchmarks. PLMs are trained via self-supervision tasks on a large corpus of general text and can quickly adapt to CIC tasks via moderate fine-tuning on the corresponding dataset. Despite their advantages, PLMs can easily overfit small datasets during fine-tuning. In this paper, we propose a multi-task learning (MTL) framework that jointly fine-tunes PLMs on a dataset of primary interest together with multiple auxiliary CIC datasets to take advantage of additional supervision signals. We develop a data-driven task relation learning (TRL) method that controls the contribution of auxiliary datasets to avoid negative transfer and expensive hyper-parameter tuning. We conduct experiments on three CIC datasets and show that fine-tuning with additional datasets can improve the PLMs’ generalization performance on the primary dataset. PLMs fine-tuned with our proposed framework outperform the current state-of-the-art models by 7% to 11% on small datasets while performing on par with the best-performing model on a large dataset.
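
As a rough illustration of the joint objective the abstract describes (not the authors' implementation), one common way to combine a primary dataset with auxiliary datasets is to add a weighted sum of the auxiliary losses to the primary loss. The function name `combined_loss` and its parameters are hypothetical. The abstract does not say how TRL derives the per-dataset contributions, so the weights are passed in as plain numbers here.

```python
from typing import Sequence


def combined_loss(
    primary_loss: float,
    aux_losses: Sequence[float],
    aux_weights: Sequence[float],
) -> float:
    """Return the primary loss plus a weighted sum of auxiliary losses.

    Assumption: one scalar weight per auxiliary dataset. How such weights
    would be learned is not specified in the abstract.
    """
    if len(aux_losses) != len(aux_weights):
        raise ValueError("need exactly one weight per auxiliary loss")
    # A weight of 0.0 removes that auxiliary dataset's contribution entirely.
    weighted_aux = sum(w * l for w, l in zip(aux_weights, aux_losses))
    return primary_loss + weighted_aux
```

Under this sketch, setting a weight to zero is one way an auxiliary dataset that causes negative transfer could be switched off without changing the rest of the objective.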

Anthology ID: 2024.findings-emnlp.974
Volume: Findings of the Association for Computational Linguistics: EMNLP 2024
Month: November
Year: 2024
Address: Miami, Florida, USA
Editors: Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue: Findings
Publisher: Association for Computational Linguistics
Pages: 16718–16732
URL: https://preview.aclanthology.org/icon-24-ingestion/2024.findings-emnlp.974/
DOI: 10.18653/v1/2024.findings-emnlp.974
Cite (ACL): Zeren Shui, Petros Karypis, Daniel S. Karls, Mingjian Wen, Saurav Manchanda, Ellad B. Tadmor, and George Karypis. 2024. Fine-Tuning Language Models on Multiple Datasets for Citation Intention Classification. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 16718–16732, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal): Fine-Tuning Language Models on Multiple Datasets for Citation Intention Classification (Shui et al., Findings 2024)
PDF: https://preview.aclanthology.org/icon-24-ingestion/2024.findings-emnlp.974.pdf
