Anastasios Bairaktaris


2020

DUTH at SemEval-2020 Task 11: BERT with Entity Mapping for Propaganda Classification
Anastasios Bairaktaris | Symeon Symeonidis | Avi Arampatzis
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This report describes the methods employed by the Democritus University of Thrace (DUTH) team for participating in SemEval-2020 Task 11: Detection of Propaganda Techniques in News Articles. Our team dealt with Subtask 2: Technique Classification. We used shallow Natural Language Processing (NLP) preprocessing techniques to reduce the noise in the dataset, feature selection methods, and common supervised machine learning algorithms. Our final model is based on using the BERT system with entity mapping. To improve our model’s accuracy, we mapped certain words into five distinct categories by employing word-classes and entity recognition.
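
The entity-mapping idea mentioned in the abstract above can be illustrated with a minimal sketch. The abstract does not name the five categories or say how words were assigned to them, so the lexicon, the placeholder labels, and the function name `map_entities` below are all hypothetical stand-ins, not the authors' implementation. The sketch only shows the general preprocessing step of replacing specific words with a category token before the text reaches a classifier.

```python
import re

# Hypothetical word-to-category lexicon. The abstract does not list the five
# categories, so these labels and entries are illustrative assumptions only.
CATEGORY_LEXICON = {
    "paris": "[LOCATION]",
    "berlin": "[LOCATION]",
    "unesco": "[ORGANIZATION]",
    "monday": "[DATE]",
}

# Split text into word tokens and single punctuation marks.
TOKEN_PATTERN = re.compile(r"\w+|[^\w\s]")


def map_entities(text, lexicon=CATEGORY_LEXICON):
    """Replace every token found in the lexicon with its category placeholder."""
    tokens = TOKEN_PATTERN.findall(text)
    # Lookup is case-insensitive; tokens not in the lexicon pass through unchanged.
    return " ".join(lexicon.get(token.lower(), token) for token in tokens)


if __name__ == "__main__":
    print(map_entities("UNESCO met in Paris on Monday."))
```

In a real pipeline the lexicon would presumably be produced by a named-entity recognizer and a part-of-speech tagger rather than written by hand; the fixed dictionary here just keeps the example self-contained.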

2019

DUTH at SemEval-2019 Task 8: Part-Of-Speech Features for Question Classification
Anastasios Bairaktaris | Symeon Symeonidis | Avi Arampatzis
Proceedings of the 13th International Workshop on Semantic Evaluation

This report describes the methods employed by the Democritus University of Thrace (DUTH) team for participating in SemEval-2019 Task 8: Fact Checking in Community Question Answering Forums. Our team dealt only with Subtask A: Question Classification. Our approach was based on shallow natural language processing (NLP) pre-processing techniques to reduce the noise in data, feature selection methods, and supervised machine learning algorithms such as NearestCentroid, Perceptron, and LinearSVC. To determine the essential features, we were aided by exploratory data analysis and visualizations. In order to improve classification accuracy, we developed a customized list of stopwords, retaining some opinion- and fact-denoting common function words which would have been removed by standard stoplisting. Furthermore, we examined the usefulness of part-of-speech (POS) categories for the task; by trying to remove nouns and adjectives, we found some evidence that verbs are a valuable POS category for the opinion question class.
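
The customized stoplist described in the abstract above can be sketched as follows. Both word sets are small hypothetical examples: the abstract does not state which standard stoplist was used or which function words were retained, so `STANDARD_STOPWORDS`, `RETAINED_WORDS`, and `remove_stopwords` are assumptions made for illustration only.

```python
# Illustrative stand-in for a standard stoplist (not the list used in the paper).
STANDARD_STOPWORDS = {
    "a", "an", "the", "is", "are", "do", "does", "of", "in", "to",
    "i", "you", "what", "where", "should",
}

# Hypothetical function words kept because they may signal whether a question
# asks for an opinion or for a fact.
RETAINED_WORDS = {"i", "you", "what", "where", "should"}

# The customized stoplist is the standard list minus the retained words.
CUSTOM_STOPWORDS = STANDARD_STOPWORDS - RETAINED_WORDS


def remove_stopwords(text, stopwords=CUSTOM_STOPWORDS):
    """Lowercase the text, split on whitespace, and drop stoplisted tokens."""
    return [token for token in text.lower().split() if token not in stopwords]


if __name__ == "__main__":
    print(remove_stopwords("What do you think of the hotel"))
```

With the standard list, "what" and "you" would be dropped from this question; the customized list keeps them, which is the kind of cue the abstract suggests is useful for separating opinion questions from factual ones.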