Abstract
To select tokens to be emphasised in short texts, a system mainly based on precomputed embedding models, such as BERT and ELMo, and on LightGBM is proposed. Its performance is low. Additional analyses suggest that it is poor at predicting the highest emphasis scores, although these are the most important for the challenge, and that it is very sensitive to the specific instances provided during learning.
- Anthology ID:
- 2020.semeval-1.218
- Volume:
- Proceedings of the Fourteenth Workshop on Semantic Evaluation
- Month:
- December
- Year:
- 2020
- Address:
- Barcelona (online)
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- International Committee for Computational Linguistics
- Pages:
- 1671–1677
- URL:
- https://aclanthology.org/2020.semeval-1.218
- DOI:
- 10.18653/v1/2020.semeval-1.218
- Cite (ACL):
- Yves Bestgen. 2020. LAST at SemEval-2020 Task 10: Finding Tokens to Emphasise in Short Written Texts with Precomputed Embedding Models and LightGBM. In Proceedings of the Fourteenth Workshop on Semantic Evaluation, pages 1671–1677, Barcelona (online). International Committee for Computational Linguistics.
- Cite (Informal):
- LAST at SemEval-2020 Task 10: Finding Tokens to Emphasise in Short Written Texts with Precomputed Embedding Models and LightGBM (Bestgen, SemEval 2020)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/2020.semeval-1.218.pdf