LAST at SemEval-2020 Task 10: Finding Tokens to Emphasise in Short Written Texts with Precomputed Embedding Models and LightGBM

Yves Bestgen


Abstract
To select tokens to be emphasised in short texts, a system mainly based on precomputed embedding models, such as BERT and ELMo, and LightGBM is proposed. Its performance is low. Additional analyses suggest that it is poor at predicting the highest emphasis scores, which are the most important for the challenge, and that it is very sensitive to the specific instances provided during learning.
Anthology ID:
2020.semeval-1.218
Volume:
Proceedings of the Fourteenth Workshop on Semantic Evaluation
Month:
December
Year:
2020
Address:
Barcelona (online)
Venue:
SemEval
SIG:
SIGLEX
Publisher:
International Committee for Computational Linguistics
Pages:
1671–1677
URL:
https://aclanthology.org/2020.semeval-1.218
DOI:
10.18653/v1/2020.semeval-1.218
Cite (ACL):
Yves Bestgen. 2020. LAST at SemEval-2020 Task 10: Finding Tokens to Emphasise in Short Written Texts with Precomputed Embedding Models and LightGBM. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1671–1677, Barcelona (online). International Committee for Computational Linguistics.
Cite (Informal):
LAST at SemEval-2020 Task 10: Finding Tokens to Emphasise in Short Written Texts with Precomputed Embedding Models and LightGBM (Bestgen, SemEval 2020)
PDF:
https://preview.aclanthology.org/starsem-semeval-split/2020.semeval-1.218.pdf