Clark Kent at SemEval-2019 Task 4: Stylometric Insights into Hyperpartisan News Detection

Viresh Gupta; Baani Leen Kaur Jolly; Ramneek Kaur; Tanmoy Chakraborty

doi:10.18653/v1/S19-2159

Abstract

In this paper, we present a news bias prediction system developed for SemEval-2019 Task 4. The system is built on XGBoost and uses character- and word-level n-gram features, represented with TF-IDF and a count-vector-based correlation matrix, to predict whether an input news article is hyperpartisan. Our model achieved a precision of 68.3% on the test set provided by the contest organizers. We also ran the model on the BuzzFeed corpus and found that XGBoost with simple character-level n-gram embeddings performs well, with an accuracy of around 96%.
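To make the feature representation concrete, the following is a minimal sketch of character-level n-gram TF-IDF weighting using only the Python standard library. It is an illustration, not the authors' implementation: the function names `char_ngrams` and `tfidf_vectors`, the n-gram range of 2 to 3, the lowercasing step, and the smoothed IDF formula are all assumptions, since the abstract does not specify them.

```python
import math
from collections import Counter


def char_ngrams(text, n_min=2, n_max=3):
    """Return all character n-grams of `text` for n in [n_min, n_max].

    The range 2..3 and the lowercasing are assumptions for illustration.
    """
    text = text.lower()
    grams = []
    for n in range(n_min, n_max + 1):
        grams.extend(text[i:i + n] for i in range(len(text) - n + 1))
    return grams


def tfidf_vectors(docs, n_min=2, n_max=3):
    """Map each document to a sparse {ngram: tf-idf weight} dictionary."""
    counts = [Counter(char_ngrams(d, n_min, n_max)) for d in docs]
    # Document frequency: number of documents that contain each n-gram.
    df = Counter(g for c in counts for g in c)
    n_docs = len(docs)
    vectors = []
    for c in counts:
        total = sum(c.values())
        vec = {}
        for gram, k in c.items():
            # Smoothed IDF (an assumed variant, not taken from the paper).
            idf = math.log((1 + n_docs) / (1 + df[gram])) + 1.0
            vec[gram] = (k / total) * idf
        vectors.append(vec)
    return vectors


if __name__ == "__main__":
    articles = ["Breaking news today", "Shocking truth revealed"]
    for vec in tfidf_vectors(articles):
        # Show the three highest-weighted n-grams of each article.
        print(sorted(vec.items(), key=lambda kv: -kv[1])[:3])
```

Sparse vectors of this kind could then be assembled into a feature matrix and passed to a gradient-boosted tree classifier such as XGBoost. That step is omitted here because the abstract gives no hyperparameters or training details.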

Anthology ID: S19-2159
Volume: Proceedings of the 13th International Workshop on Semantic Evaluation
Month: June
Year: 2019
Address: Minneapolis, Minnesota, USA
Venue: SemEval
SIG: SIGLEX
Publisher: Association for Computational Linguistics
Pages: 934–938
URL: https://aclanthology.org/S19-2159
DOI: 10.18653/v1/S19-2159
Cite (ACL): Viresh Gupta, Baani Leen Kaur Jolly, Ramneek Kaur, and Tanmoy Chakraborty. 2019. Clark Kent at SemEval-2019 Task 4: Stylometric Insights into Hyperpartisan News Detection. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 934–938, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
Cite (Informal): Clark Kent at SemEval-2019 Task 4: Stylometric Insights into Hyperpartisan News Detection (Gupta et al., SemEval 2019)
PDF: https://preview.aclanthology.org/ingestion-script-update/S19-2159.pdf
