Lay-Ki Soon


2021

Effective Use of Graph Convolution Network and Contextual Sub-Tree for Commodity News Event Extraction
Meisin Lee | Lay-Ki Soon | Eu-Gene Siew
Proceedings of the Third Workshop on Economics and Natural Language Processing

Event extraction in commodity news is a less researched area as compared to generic event extraction. However, accurate event extraction from commodity news is useful in a broad range of applications such as understanding event chains and learning event-event relations, which can then be used for commodity price prediction. The events found in commodity news exhibit characteristics different from generic events, hence posing a unique challenge in event extraction using existing methods. This paper proposes an effective use of Graph Convolutional Networks (GCN) with a pruned dependency parse tree, termed contextual sub-tree, for better event extraction in commodity news. The event extraction model is trained using feature embeddings from ComBERT, a BERT-based masked language model that was produced through domain-adaptive pre-training on a commodity news corpus. Experimental results show the efficiency of the proposed solution, which outperforms existing methods with F1 scores as high as 0.90. Furthermore, our pre-trained language model outperforms GloVe by 23%, and BERT and RoBERTa by 7% in terms of argument roles classification. For the goal of reproducibility, the code and trained models are made publicly available.

2019

Hybrid Models for Aspects Extraction without Labelled Dataset
Wai-Howe Khong | Lay-Ki Soon | Hui-Ngo Goh
Proceedings of the Second Workshop on Fact Extraction and VERification (FEVER)

One of the important tasks in opinion mining is to extract aspects of the opinion target. Aspects are features or characteristics of the opinion target that are being reviewed, which can be categorised into explicit and implicit aspects. Extracting aspects from opinions is essential in order to ensure that accurate information about certain attributes of an opinion target is retrieved. For instance, a professional camera may receive positive feedback on its functionalities in a review, while its overly high price receives negative feedback. Most of the existing solutions focus on explicit aspects. However, sentences in reviews normally do not state the aspects explicitly. In this research, two hybrid models, namely TDM-DC and TDM-TED, are proposed to identify and extract both explicit and implicit aspects. The proposed models combine topic modelling with a dictionary-based approach. The models are unsupervised as they do not require any labelled dataset. The experimental results show that TDM-DC achieves an F1-measure of 58.70%, outperforming both the baseline topic model and the dictionary-based approach. In comparison to other existing unsupervised techniques, the proposed models achieve a higher F1-measure by approximately 3%. Although the supervised techniques perform slightly better, the proposed models are domain-independent, and hence more versatile.

2013

Context-Dependent Multilingual Lexical Lookup for Under-Resourced Languages
Lian Tze Lim | Lay-Ki Soon | Tek Yong Lim | Enya Kong Tang | Bali Ranaivo-Malançon
Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)