Clark Kent at SemEval-2019 Task 4: Stylometric Insights into Hyperpartisan News Detection

Viresh Gupta, Baani Leen Kaur Jolly, Ramneek Kaur, Tanmoy Chakraborty


Abstract
In this paper, we present a news bias prediction system developed for SemEval-2019 Task 4. The system is based on XGBoost and uses character- and word-level n-gram features, represented using TF-IDF and a count-vector-based correlation matrix, to predict whether an input news article is hyperpartisan. Our model achieved a precision of 68.3% on the test set provided by the contest organizers. We also ran our model on the BuzzFeed corpus and found that XGBoost with simple character-level n-gram embeddings performs well, with an accuracy of around 96%.
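As a rough illustration of the feature representation the abstract mentions, the following sketch computes TF-IDF weights over character-level and word-level n-grams using only the Python standard library. It is not the authors' implementation: the n-gram ranges, the smoothed IDF formula, the L2 normalization, and all function names (`char_ngrams`, `word_ngrams`, `tfidf_vectors`) are assumptions chosen for illustration, and the toy documents are made up. The classifier itself (XGBoost in the paper) is left out; the resulting sparse vectors would be the input to such a model.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return lowercased character n-grams of length n_min..n_max (assumed range)."""
    text = text.lower()
    return [text[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(text) - n + 1)]


def word_ngrams(text, n_min=1, n_max=2):
    """Return lowercased whitespace-token n-grams of length n_min..n_max (assumed range)."""
    tokens = text.lower().split()
    return [" ".join(tokens[i:i + n])
            for n in range(n_min, n_max + 1)
            for i in range(len(tokens) - n + 1)]


def tfidf_vectors(docs, analyzer):
    """Map each document to a sparse {term: weight} dict of L2-normalized TF-IDF weights.

    Assumption: smoothed IDF, idf(t) = ln((1 + N) / (1 + df(t))) + 1.
    """
    counts = [Counter(analyzer(doc)) for doc in docs]

    # Document frequency: number of documents containing each term.
    df = Counter()
    for c in counts:
        df.update(c.keys())

    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vec = {}
        for term, freq in c.items():
            idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0
            vec[term] = (freq / total) * idf
        # L2-normalize; an empty document yields an empty vector.
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        vectors.append({t: w / norm for t, w in vec.items()})
    return vectors


# Toy, made-up documents purely for demonstration.
toy_docs = ["the senate passed the bill", "the radical senate betrayed voters"]
char_features = tfidf_vectors(toy_docs, char_ngrams)
word_features = tfidf_vectors(toy_docs, word_ngrams)
```

In this sketch, terms that occur in fewer documents receive a larger IDF factor, so n-grams distinctive to one article are weighted above n-grams shared by all articles.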
Anthology ID:
S19-2159
Volume:
Proceedings of the 13th International Workshop on Semantic Evaluation
Month:
June
Year:
2019
Address:
Minneapolis, Minnesota, USA
Venue:
SemEval
SIG:
SIGLEX
Publisher:
Association for Computational Linguistics
Pages:
934–938
URL:
https://aclanthology.org/S19-2159
DOI:
10.18653/v1/S19-2159
Cite (ACL):
Viresh Gupta, Baani Leen Kaur Jolly, Ramneek Kaur, and Tanmoy Chakraborty. 2019. Clark Kent at SemEval-2019 Task 4: Stylometric Insights into Hyperpartisan News Detection. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 934–938, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal):
Clark Kent at SemEval-2019 Task 4: Stylometric Insights into Hyperpartisan News Detection (Gupta et al., SemEval 2019)
PDF:
https://preview.aclanthology.org/ingestion-script-update/S19-2159.pdf