Abstract
This paper reports an experiment investigating the relevance of several syntactic, stylistic and pragmatic features to the task of distinguishing between mainstream and partisan news articles. The results of evaluating different feature sets, and the extent to which the various feature categories affect the performance metrics, are discussed and compared. Among the combinations of features and classifiers tested, a Random Forest classifier using vector representations of the headline and the text of the report, together with 8 readability scores and a few stylistic features, yielded the best result, ranking our team 9th in the SemEval 2019 Hyperpartisan News Detection challenge.
- Anthology ID:
- S19-2179
- Volume:
- Proceedings of the 13th International Workshop on Semantic Evaluation
- Month:
- June
- Year:
- 2019
- Address:
- Minneapolis, Minnesota, USA
- Venue:
- SemEval
- SIG:
- SIGLEX
- Publisher:
- Association for Computational Linguistics
- Pages:
- 1026–1031
- URL:
- https://aclanthology.org/S19-2179
- DOI:
- 10.18653/v1/S19-2179
- Cite (ACL):
- Bozhidar Stevanoski and Sonja Gievska. 2019. Team Ned Leeds at SemEval-2019 Task 4: Exploring Language Indicators of Hyperpartisan Reporting. In Proceedings of the 13th International Workshop on Semantic Evaluation, pages 1026–1031, Minneapolis, Minnesota, USA. Association for Computational Linguistics.
- Cite (Informal):
- Team Ned Leeds at SemEval-2019 Task 4: Exploring Language Indicators of Hyperpartisan Reporting (Stevanoski & Gievska, SemEval 2019)
- PDF:
- https://preview.aclanthology.org/nodalida-main-page/S19-2179.pdf