Xingyi Song


2020

Using Deep Neural Networks with Intra- and Inter-Sentence Context to Classify Suicidal Behaviour
Xingyi Song | Johnny Downs | Sumithra Velupillai | Rachel Holden | Maxim Kikoler | Kalina Bontcheva | Rina Dutta | Angus Roberts
Proceedings of the Twelfth Language Resources and Evaluation Conference

Identifying statements related to suicidal behaviour in psychiatric electronic health records (EHRs) is an important step when modeling that behaviour, and when assessing suicide risk. We apply a deep neural network-based classification model with a lightweight context encoder to classify sentence-level suicidal behaviour in EHRs. We show that incorporating information from sentences to the left and right of the target sentence significantly improves classification accuracy. Our approach achieved the best performance when classifying suicidal behaviour in Autism Spectrum Disorder patient records. The results could have implications for suicidality research and clinical surveillance.

RP-DNN: A Tweet Level Propagation Context Based Deep Neural Networks for Early Rumor Detection in Social Media
Jie Gao | Sooji Han | Xingyi Song | Fabio Ciravegna
Proceedings of the Twelfth Language Resources and Evaluation Conference

Early rumor detection (ERD) on social media platforms is very challenging when only limited, incomplete and noisy information is available. Most existing methods work on event-level detection, which requires the collection of posts relevant to a specific event, and rely only on user-generated content. They are not appropriate for detecting rumor sources in the very early stages, before an event unfolds and becomes widespread. In this paper, we address the task of ERD at the message level. We present a novel hybrid neural network architecture, which combines a task-specific character-based bidirectional language model and stacked Long Short-Term Memory (LSTM) networks to represent textual contents and social-temporal contexts of input source tweets, for modelling propagation patterns of rumors in the early stages of their development. We apply multi-layered attention models to jointly learn attentive context embeddings over multiple context inputs. Our experiments employ a stringent leave-one-out cross-validation (LOO-CV) evaluation setup on seven publicly available real-life rumor event data sets. Our models achieve state-of-the-art (SoA) performance for detecting unseen rumors on large augmented data which covers more than 12 events and 2,967 rumors. An ablation study is conducted to understand the relative contribution of each component of our proposed model.

2019

Team Bertha von Suttner at SemEval-2019 Task 4: Hyperpartisan News Detection using ELMo Sentence Representation Convolutional Network
Ye Jiang | Johann Petrak | Xingyi Song | Kalina Bontcheva | Diana Maynard
Proceedings of the 13th International Workshop on Semantic Evaluation

This paper describes the participation of team “bertha-von-suttner” in SemEval-2019 Task 4, Hyperpartisan News Detection. Our system uses sentence representations from averaged word embeddings generated from the pre-trained ELMo model with Convolutional Neural Networks and Batch Normalization for predicting hyperpartisan news. The final predictions were generated from the averaged predictions of an ensemble of models. With this architecture, our system ranked in first place, based on accuracy, the official scoring metric.

2018

A Deep Neural Network Sentence Level Classification Method with Context Information
Xingyi Song | Johann Petrak | Angus Roberts
Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing

In the sentence classification task, context formed from sentences adjacent to the sentence being classified can provide important information for classification. This context is, however, often ignored. Where methods do make use of context, only small amounts are considered, making it difficult to scale. We present a new method for sentence classification, Context-LSTM-CNN, that makes use of potentially large contexts. The method also utilizes long-range dependencies within the sentence being classified, using an LSTM, and short-span features, using a stacked CNN. Our experiments demonstrate that this approach consistently improves over previous methods on two different datasets.

2017

Comparing Attitudes to Climate Change in the Media using sentiment analysis based on Latent Dirichlet Allocation
Ye Jiang | Xingyi Song | Jackie Harrison | Shaun Quegan | Diana Maynard
Proceedings of the 2017 EMNLP Workshop: Natural Language Processing meets Journalism

News media typically present biased accounts of news stories, and different publications present different angles on the same event. In this research, we investigate how different publications differ in their approach to stories about climate change, by examining the sentiment and topics presented. To understand these attitudes, we find sentiment targets by combining Latent Dirichlet Allocation (LDA) with SentiWordNet, a general sentiment lexicon. Using LDA, we generate topics containing keywords which represent the sentiment targets, and then annotate the data using SentiWordNet before regrouping the articles based on topic similarity. Preliminary analysis identifies clearly different attitudes on the same issue presented in different news sources. Ongoing work is investigating how systematic these attitudes are between different publications, and how these may change over time.

2016

Sheffield Systems for the English-Romanian WMT Translation Task
Frédéric Blain | Xingyi Song | Lucia Specia
Proceedings of the First Conference on Machine Translation: Volume 2, Shared Task Papers

2014

Data selection for discriminative training in statistical machine translation
Xingyi Song | Lucia Specia | Trevor Cohn
Proceedings of the 17th Annual conference of the European Association for Machine Translation

2011

Regression and Ranking based Optimisation for Sentence Level MT Evaluation
Xingyi Song | Trevor Cohn
Proceedings of the Sixth Workshop on Statistical Machine Translation