Wolfies at SemEval-2022 Task 8: Feature extraction pipeline with transformers for Multi-lingual news article similarity

Nikhil Goel, Ranjith Reddy Bommidi


Abstract
This work addresses measuring the similarity between a pair of news articles. The dataset provides seven objective similarity metrics for each pair, and the articles are written in multiple languages. As a baseline, we computed cosine similarity on top of a pre-trained embedding model; a feed-forward neural network was then trained on top of it to improve the results. We also built a separate feature-extraction pipeline for each similarity metric. Feature extraction and the feed-forward neural network gave a significant improvement over the baseline results.
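The cosine-similarity baseline mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not name the embedding model or describe how the authors computed the score, so the function name `cosine_similarity` and the toy three-dimensional vectors below are assumptions made purely for illustration; real article embeddings would come from the pre-trained model and have far more dimensions.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Return the cosine of the angle between two embedding vectors."""
    if len(u) != len(v):
        raise ValueError("embeddings must have the same dimensionality")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # A zero vector has no direction; treat the pair as unrelated.
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical toy embeddings standing in for the output of a
# pre-trained multilingual model (assumption: not the authors' data).
emb_a = [1.0, 0.0, 1.0]
emb_b = [1.0, 0.0, 1.0]
emb_c = [0.0, 1.0, 0.0]

same_story = cosine_similarity(emb_a, emb_b)       # identical direction
different_story = cosine_similarity(emb_a, emb_c)  # orthogonal vectors
```

A score like this is a single number per article pair, whereas the dataset has seven similarity metrics per pair; this is one plausible reason a trained model on top of the embeddings could improve on the baseline.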
Anthology ID:
2022.semeval-1.159
Original:
2022.semeval-1.159v1
Version 2:
2022.semeval-1.159v2
Volume:
Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022)
Month:
July
Year:
2022
Address:
Seattle, United States
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
1129–1135
URL:
https://aclanthology.org/2022.semeval-1.159
DOI:
10.18653/v1/2022.semeval-1.159
Cite (ACL):
Nikhil Goel and Ranjith Reddy Bommidi. 2022. Wolfies at SemEval-2022 Task 8: Feature extraction pipeline with transformers for Multi-lingual news article similarity. In Proceedings of the 16th International Workshop on Semantic Evaluation (SemEval-2022), pages 1129–1135, Seattle, United States. Association for Computational Linguistics.
Cite (Informal):
Wolfies at SemEval-2022 Task 8: Feature extraction pipeline with transformers for Multi-lingual news article similarity (Goel & Bommidi, SemEval 2022)
PDF:
https://preview.aclanthology.org/remove-xml-comments/2022.semeval-1.159.pdf
Video:
https://preview.aclanthology.org/remove-xml-comments/2022.semeval-1.159.mp4