Byron Galbraith


2018

This paper describes our approach to SemEval-2018 Task 7: given an entity-tagged text from the ACL Anthology corpus, identify and classify pairs of entities that have one of six possible semantic relationships. Our model is a convolutional neural network that uses pre-trained word embeddings, unlabeled ACL abstracts, and multiple window sizes to automatically learn useful features from entity-tagged sentences. We also experiment with a hybrid loss function, a combination of cross-entropy loss and ranking loss, to increase the separation between classification scores. Lastly, we include WordNet-based features to further improve the performance of our model. Our best model achieves macro-averaged F1 scores of 74.2 and 84.8 on subtasks 1.1 and 1.2, respectively.
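The hybrid loss mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so everything here is an assumption for illustration: the hinge-style ranking term, the mixing weight `alpha`, the `margin` value, and all function names are hypothetical and are not taken from the paper.

```python
import math


def softmax_cross_entropy(scores, gold):
    """Cross-entropy of the gold class under a softmax over raw class scores."""
    top = max(scores)  # subtract the max for numerical stability
    log_z = top + math.log(sum(math.exp(s - top) for s in scores))
    return log_z - scores[gold]


def margin_ranking_loss(scores, gold, margin=1.0):
    """Hinge-style ranking term (an assumed form, not the paper's).

    Penalizes the example unless the gold score exceeds the best
    competing score by at least `margin`. Needs at least two classes.
    """
    best_wrong = max(s for i, s in enumerate(scores) if i != gold)
    return max(0.0, margin - scores[gold] + best_wrong)


def hybrid_loss(scores, gold, alpha=0.5, margin=1.0):
    """Weighted sum of the two terms; `alpha` is a hypothetical mixing weight."""
    ce = softmax_cross_entropy(scores, gold)
    rank = margin_ranking_loss(scores, gold, margin)
    return alpha * ce + (1.0 - alpha) * rank
```

In this sketch the ranking term drops to zero once the gold class leads by the margin, while the cross-entropy term keeps pushing the probability of the gold class upward; combining them is one plausible way to widen the gap between class scores.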

2017

This paper describes our approach to the SemEval-2017 shared task of determining question-question similarity in a community question-answering setting (Task 3B). We extracted both syntactic and semantic similarity features between candidate questions, performed pairwise-preference learning to optimize for ranking order, and then trained a random forest classifier to predict whether the candidate questions are paraphrases of each other. This approach achieved a mean average precision (MAP) of 45.7%, out of a maximum achievable 67.0%, on the test set.
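For readers unfamiliar with the metric, the following is a minimal sketch of mean average precision over ranked candidate lists. It is not the official task scorer: the function names are hypothetical, and scoring a query with no relevant candidates as zero is an assumption of this sketch.

```python
def average_precision(relevance):
    """Average precision for one ranked list of 0/1 relevance labels."""
    hits = 0
    total = 0.0
    for rank, is_relevant in enumerate(relevance, start=1):
        if is_relevant:
            hits += 1
            total += hits / rank  # precision at this relevant position
    # Assumption: a query with no relevant candidates scores zero.
    return total / hits if hits else 0.0


def mean_average_precision(rankings):
    """Mean of per-query average precision over a non-empty list of queries."""
    return sum(average_precision(r) for r in rankings) / len(rankings)
```

Under this convention, any query without a relevant candidate lowers the best attainable score, which is one way a maximum below 100% can arise.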