Active Learning for Interactive Relation Extraction in a French Newspaper’s Articles
Cyrielle Mallart, Michel Le Nouy, Guillaume Gravier, Pascale Sébillot
Abstract
Relation extraction is a subtask of natural language processing that has seen many improvements in recent years, with the advent of complex pre-trained architectures. Many of these state-of-the-art approaches are tested against benchmarks with labelled sentences containing tagged entities, and require substantial pre-training and fine-tuning on task-specific data. However, in a real use-case scenario such as a newspaper company mostly dedicated to local information, relations are of varied, highly specific types, with virtually no annotated data for such relations, and many entities co-occur in a sentence without being related. We question the use of supervised state-of-the-art models in such a context, where resources such as time, computing power and human annotators are limited. To adapt to these constraints, we experiment with an active-learning-based relation extraction pipeline, consisting of a lightweight binary LSTM-based model for detecting the relations that do exist, and a state-of-the-art model for relation classification. We compare several choices for classification models in this scenario, from basic word embedding averaging to graph neural networks and BERT-based ones, as well as several active learning acquisition strategies, in order to find the most cost-efficient yet accurate approach for our use case, that of the largest French daily newspaper company.
- Anthology ID: 2021.ranlp-1.101
- Volume: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)
- Month: September
- Year: 2021
- Address: Held Online
- Venue: RANLP
- Publisher: INCOMA Ltd.
- Pages: 886–894
- URL: https://aclanthology.org/2021.ranlp-1.101
- Cite (ACL): Cyrielle Mallart, Michel Le Nouy, Guillaume Gravier, and Pascale Sébillot. 2021. Active Learning for Interactive Relation Extraction in a French Newspaper’s Articles. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021), pages 886–894, Held Online. INCOMA Ltd.
- Cite (Informal): Active Learning for Interactive Relation Extraction in a French Newspaper’s Articles (Mallart et al., RANLP 2021)
- PDF: https://preview.aclanthology.org/ingestion-script-update/2021.ranlp-1.101.pdf