See Kiong Ng

Also published as: See-Kiong Ng


2022

pdf
Are All the Datasets in Benchmark Necessary? A Pilot Study of Dataset Evaluation for Text Classification
Yang Xiao | Jinlan Fu | See-Kiong Ng | Pengfei Liu
Proceedings of the 2022 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

In this paper, we ask whether all the datasets in a benchmark are necessary. We approach this question by first characterizing how well each dataset distinguishes between different systems. Experiments on 9 datasets and 36 systems show that several existing benchmark datasets contribute little to discriminating top-scoring systems, while some less frequently used datasets exhibit impressive discriminative power. Taking text classification as a case study, we further investigate whether a dataset's discriminative ability can be predicted from its properties (e.g., average sentence length). Our preliminary experiments show that, given a sufficient number of training experimental records, a meaningful predictor can be learned to estimate the discriminative ability of unseen datasets. We release all datasets, with the features explored in this work, on DataLab.
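
The paper's second step, learning to predict a dataset's discriminative ability from its properties, can be illustrated with a minimal sketch. The feature set, the synthetic records, and the gradient-boosted regressor below are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch: learn to predict a dataset's discriminative power from its
# surface properties. The features, data, and model choice are illustrative
# assumptions; the paper's actual features and predictor may differ.
import numpy as np
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical records: one row per dataset, with an observed "discrimination"
# score measured from system-comparison experiments.
n_datasets = 200
features = np.column_stack([
    rng.uniform(5, 60, n_datasets),      # average sentence length
    rng.uniform(1e3, 1e5, n_datasets),   # vocabulary size
    rng.uniform(2, 20, n_datasets),      # number of classes
])
discrimination = rng.uniform(0, 1, n_datasets)  # placeholder target scores

X_train, X_test, y_train, y_test = train_test_split(
    features, discrimination, test_size=0.2, random_state=0)

predictor = GradientBoostingRegressor(random_state=0).fit(X_train, y_train)
print("Predicted discrimination for held-out datasets:",
      predictor.predict(X_test)[:5])
```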

pdf
Polyglot Prompt: Multilingual Multitask Prompt Training
Jinlan Fu | See-Kiong Ng | Pengfei Liu
Proceedings of the 2022 Conference on Empirical Methods in Natural Language Processing

This paper investigates a potential architectural improvement for multilingual learning and asks: can different tasks in different languages be modeled in a single monolithic framework, i.e., without any task- or language-specific modules? Achieving this could open new doors for future multilingual research, for example by allowing systems trained with low resources to be further assisted by other languages as well as other tasks. We approach this goal by developing a learning framework, Polyglot Prompting, that exploits prompting methods and multilingual prompt engineering to learn a unified semantic space for different languages and tasks. We perform a comprehensive evaluation on 6 tasks, namely topic classification, sentiment classification, named entity recognition, question answering, natural language inference, and summarization, covering 24 datasets and 49 languages. The experimental results demonstrate the efficacy of multilingual multitask prompt-based learning and lead to several inspiring observations. We also present an interpretable multilingual evaluation methodology and show how the proposed framework, multilingual multitask prompt training, works. We release all datasets prompted in the best setting, together with the code.
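
As a rough illustration of the underlying idea of casting every (task, language) pair into one shared text-to-text format via prompt templates, the sketch below uses made-up templates; the paper's actual prompts and training procedure are not reproduced here.

```python
# Minimal sketch of multilingual multitask prompting: every (task, language)
# example is rendered into one shared text-to-text format. The templates and
# examples are made up for illustration; they are not the paper's prompts.
TEMPLATES = {
    ("sentiment", "en"): "Review: {text} Question: Is the sentiment positive or negative?",
    ("sentiment", "es"): "Reseña: {text} Pregunta: ¿El sentimiento es positivo o negativo?",
    ("nli", "en"): "Premise: {premise} Hypothesis: {hypothesis} Question: entailment, neutral, or contradiction?",
}

def to_prompt(task: str, lang: str, **fields) -> str:
    """Render one example into the unified prompt format."""
    return TEMPLATES[(task, lang)].format(**fields)

print(to_prompt("sentiment", "es", text="La película fue excelente."))
print(to_prompt("nli", "en",
                premise="A man is playing a guitar.",
                hypothesis="A person is making music."))
```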

pdf
CorefDiffs: Co-referential and Differential Knowledge Flow in Document Grounded Conversations
Lin Xu | Qixian Zhou | Jinlan Fu | Min-Yen Kan | See-Kiong Ng
Proceedings of the 29th International Conference on Computational Linguistics

Knowledge-grounded dialog systems need to incorporate smooth transitions among the knowledge selected for generating responses, so that the dialog flows naturally. For document-grounded dialog systems, inter- and intra-document knowledge relations can be used to model such conversational flows. We develop a novel Multi-Document Co-Referential Graph (Coref-MDG) to effectively capture inter-document relationships, based on commonsense and similarity, and the intra-document co-referential structure of knowledge segments within the grounding documents. We propose CorefDiffs, a Co-referential and Differential flow management method, to linearize the static Coref-MDG into conversational sequence logic. CorefDiffs performs knowledge selection by accounting for contextual graph structures and knowledge-difference sequences, and significantly outperforms the state of the art by 9.5%, 7.4%, and 8.2% on three public benchmarks. This demonstrates that effective modeling of co-reference and knowledge differences in dialog flows is critical for transitions in document-grounded conversation.
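
A toy sketch of this kind of multi-document graph follows: knowledge segments are nodes, with intra-document and inter-document edges. The example segments, the adjacency-based stand-in for co-reference, and the overlap-based similarity are assumptions for illustration, not the paper's Coref-MDG construction.

```python
# Toy sketch of a multi-document knowledge graph in the spirit of Coref-MDG:
# knowledge segments are nodes, intra-document edges link adjacent segments
# (a crude stand-in for co-reference links), and inter-document edges link
# lexically similar segments. Segments, similarity, and thresholds are
# illustrative assumptions, not the paper's construction.
import itertools
import networkx as nx

segments = {
    ("doc1", 0): "Marie Curie discovered polonium and radium.",
    ("doc1", 1): "She won the Nobel Prize in Physics in 1903.",
    ("doc2", 0): "Radium was later used in early cancer treatments.",
}

def token_overlap(a: str, b: str) -> float:
    sa = set(a.lower().replace(".", "").split())
    sb = set(b.lower().replace(".", "").split())
    return len(sa & sb) / max(1, len(sa | sb))

G = nx.Graph()
G.add_nodes_from(segments)

for u, v in itertools.combinations(segments, 2):
    same_doc = u[0] == v[0]
    sim = token_overlap(segments[u], segments[v])
    if same_doc and abs(u[1] - v[1]) == 1:
        G.add_edge(u, v, kind="coreferential")          # adjacent in one doc
    elif not same_doc and sim > 0.05:
        G.add_edge(u, v, kind="similarity", weight=sim)  # across documents

print(G.edges(data=True))
```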

pdf
QSTS: A Question-Sensitive Text Similarity Measure for Question Generation
Sujatha Das Gollapalli | See-Kiong Ng
Proceedings of the 29th International Conference on Computational Linguistics

While question generation (QG) has received significant attention in conversation modeling and text generation research, the problems of comparing questions and evaluating QG models have remained inadequately addressed. Indeed, QG models continue to be evaluated using traditional measures such as BLEU, METEOR, and ROUGE, which were designed for other text generation problems. We propose QSTS, a novel Question-Sensitive Text Similarity measure that compares two questions by characterizing their target intent based on question-class, named-entity, and semantic-similarity information from the two questions. We show that QSTS addresses several shortcomings of existing measures that depend on n-gram overlap and obtains superior results compared to traditional measures on publicly available QG datasets. We also collect SimQG, a novel dataset for enabling question-similarity research in QG contexts. SimQG contains questions generated by state-of-the-art QG models, along with human judgements of their relevance both to the passage context they were generated for and to the given reference question. Using SimQG, we showcase the key aspect of QSTS that differentiates it from all existing measures: QSTS not only characterizes the similarity between two questions, but can also score questions with respect to passage contexts. QSTS is thus, to our knowledge, the first metric that enables measuring QG performance in a reference-free manner.
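
The general idea of combining question-class, named-entity, and semantic signals into a single question-sensitive score can be sketched as follows. The component heuristics and the equal weighting below are illustrative assumptions, not the actual QSTS formulation.

```python
# Rough sketch of a question-sensitive similarity in the spirit of QSTS:
# combine (i) question-class agreement, (ii) named-entity overlap, and
# (iii) a semantic-similarity proxy. The components, the token-overlap proxy,
# and the equal weights are illustrative assumptions, not the QSTS definition.
def question_class(q: str) -> str:
    q = q.lower()
    for wh in ("who", "where", "when", "why", "how", "which", "what"):
        if q.startswith(wh):
            return wh
    return "other"

def entity_overlap(q1: str, q2: str) -> float:
    # Crude stand-in for NER: treat capitalized non-initial tokens as entities.
    ents = lambda q: {t.strip("?.,") for t in q.split()[1:] if t[:1].isupper()}
    e1, e2 = ents(q1), ents(q2)
    return len(e1 & e2) / max(1, len(e1 | e2))

def semantic_proxy(q1: str, q2: str) -> float:
    # Stand-in for embedding similarity: Jaccard overlap of lowercased tokens.
    s1 = {t.strip("?.,").lower() for t in q1.split()}
    s2 = {t.strip("?.,").lower() for t in q2.split()}
    return len(s1 & s2) / max(1, len(s1 | s2))

def qsts_like(q1: str, q2: str) -> float:
    class_match = float(question_class(q1) == question_class(q2))
    return (class_match + entity_overlap(q1, q2) + semantic_proxy(q1, q2)) / 3

print(qsts_like("Who founded Microsoft?", "Who is the founder of Microsoft?"))
print(qsts_like("Who founded Microsoft?", "When was Microsoft founded?"))
```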

2021

pdf
NUS-IDS at FinCausal 2021: Dependency Tree in Graph Neural Network for Better Cause-Effect Span Detection
Fiona Anting Tan | See-Kiong Ng
Proceedings of the 3rd Financial Narrative Processing Workshop

pdf
Suicide Risk Prediction by Tracking Self-Harm Aspects in Tweets: NUS-IDS at the CLPsych 2021 Shared Task
Sujatha Das Gollapalli | Guilherme Augusto Zagatti | See-Kiong Ng
Proceedings of the Seventh Workshop on Computational Linguistics and Clinical Psychology: Improving Access

We describe our system, developed for the CLPsych 2021 Shared Task, for identifying users at risk of suicide based on their tweets. Motivated by research in mental health linking self-harm tendencies with suicide, our system characterizes the self-harm aspects expressed in a user's tweets over a period of time. To this end, we design SHTM, a Self-Harm Topic Model that combines Latent Dirichlet Allocation with a self-harm dictionary to model users' daily tweets. Differences in moods and topics over time are then captured as features to train a deep learning model for suicide risk prediction.
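
A toy sketch of this general feature pipeline appears below: per-day topic distributions plus a dictionary-based self-harm score, with day-over-day differences as downstream features. The lexicon, the use of plain LDA (rather than the dictionary-guided SHTM), and the feature construction are simplified assumptions.

```python
# Toy sketch in the spirit of SHTM's feature pipeline: per-day topic
# distributions (plain LDA here, not the paper's dictionary-guided model)
# plus a dictionary-based self-harm score, with day-over-day differences as
# features. The lexicon and settings below are illustrative assumptions.
import numpy as np
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.feature_extraction.text import CountVectorizer

SELF_HARM_LEXICON = {"hurt", "pain", "hopeless", "alone"}  # hypothetical

daily_tweets = [  # one concatenated document per day for one user
    "had a great day with friends at the park",
    "feeling alone and hopeless tonight nothing helps",
    "the pain will not stop i want it to end",
]

counts = CountVectorizer().fit_transform(daily_tweets)
lda = LatentDirichletAllocation(n_components=3, random_state=0)
topics = lda.fit_transform(counts)                       # (n_days, n_topics)

def lexicon_score(doc: str) -> float:
    tokens = doc.split()
    return sum(t in SELF_HARM_LEXICON for t in tokens) / max(1, len(tokens))

lex = np.array([[lexicon_score(d)] for d in daily_tweets])  # (n_days, 1)
daily_features = np.hstack([topics, lex])

# Day-over-day changes in topics/mood, usable as input to a downstream
# suicide-risk classifier (not shown).
temporal_diffs = np.diff(daily_features, axis=0)
print(temporal_diffs.shape)   # (n_days - 1, n_topics + 1)
```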

pdf
NUS-IDS at CASE 2021 Task 1: Improving Multilingual Event Sentence Coreference Identification With Linguistic Information
Fiona Anting Tan | Sujatha Das Gollapalli | See-Kiong Ng
Proceedings of the 4th Workshop on Challenges and Applications of Automated Extraction of Socio-political Events from Text (CASE 2021)

Event Sentence Coreference Identification (ESCI) aims to cluster event sentences that refer to the same event, to support information extraction. We describe our ESCI solution developed for the ACL-CASE 2021 shared tasks on the detection and classification of socio-political and crisis event information in a multilingual setting. For a given article, our proposed pipeline comprises an accurate sentence-pair classifier that identifies coreferent sentence pairs, and then uses the predicted probabilities to cluster sentences into groups. Sentence-pair representations are constructed from fine-tuned BERT embeddings plus POS embeddings fed through a BiLSTM model, combined with linguistic lexical and semantic similarities between the sentences. Our best models ranked 2nd, 1st, and 2nd, obtaining CoNLL F1 scores of 81.20%, 93.03%, and 83.15% on the English, Portuguese, and Spanish test sets respectively in the ACL-CASE 2021 competition.
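
The second stage of such a pipeline, turning pairwise coreference probabilities into sentence clusters, can be sketched as follows; the probability matrix and the agglomerative-clustering settings are illustrative assumptions rather than the system's actual configuration.

```python
# Sketch of the clustering stage: pairwise coreference probabilities from a
# sentence-pair classifier are converted to distances and grouped with
# agglomerative clustering. The probability matrix and threshold are made up
# for illustration; the system's classifier and settings may differ.
import numpy as np
from sklearn.cluster import AgglomerativeClustering

# Hypothetical P(coreferent) for 4 sentences from one article.
prob = np.array([
    [1.0, 0.9, 0.2, 0.1],
    [0.9, 1.0, 0.3, 0.2],
    [0.2, 0.3, 1.0, 0.8],
    [0.1, 0.2, 0.8, 1.0],
])

distance = 1.0 - prob
clustering = AgglomerativeClustering(
    n_clusters=None,
    distance_threshold=0.5,      # merge pairs with P(coreferent) > 0.5
    metric="precomputed",        # use "affinity" on scikit-learn < 1.2
    linkage="average",
).fit(distance)

print(clustering.labels_)        # e.g., [0 0 1 1]
```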

pdf bib
Causal Augmentation for Causal Sentence Classification
Fiona Anting Tan | Devamanyu Hazarika | See-Kiong Ng | Soujanya Poria | Roger Zimmermann
Proceedings of the First Workshop on Causal Inference and NLP

Scarcity of annotated causal texts leads to poor robustness when training state-of-the-art language models for causal sentence classification. In particular, we found that models misclassify augmented sentences that have been negated or strengthened with respect to their causal meaning. This is worrying, since minor linguistic differences in causal sentences can have disparate meanings. We therefore propose generating counterfactual causal sentences by creating contrast sets (Gardner et al., 2020) to be included during model training. We experimented with two model architectures and predicted on two out-of-domain corpora. While our strengthening schemes proved useful in improving model performance, for negation, regular edits were insufficient. Thus, we also introduce heuristics such as shortening or multiplying the root words of a sentence. By including a mixture of edits during training, we achieved performance improvements beyond the baseline across both models, both within and outside the corpus domain, suggesting that our proposed augmentation can also help models generalize.
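
Simple negation and strengthening edits of the kind described above can be sketched as below; the substitution lists are toy assumptions and do not reproduce the paper's edit schemes.

```python
# Toy sketch of contrast-set style edits for causal sentences: a "strengthen"
# edit replaces a hedged causal phrase with a stronger one, and a "negate"
# edit inserts a negation. The substitution lists are illustrative assumptions
# and do not reproduce the paper's edit schemes.
STRENGTHEN = {
    "may cause": "causes",
    "could lead to": "leads to",
    "is associated with": "results in",
}
NEGATE = {
    "causes": "does not cause",
    "leads to": "does not lead to",
}

def strengthen(sentence: str) -> str:
    for weak, strong in STRENGTHEN.items():
        sentence = sentence.replace(weak, strong)
    return sentence

def negate(sentence: str) -> str:
    for pos, neg in NEGATE.items():
        sentence = sentence.replace(pos, neg)
    return sentence

s = "Smoking may cause lung damage."
print(strengthen(s))          # strengthened: label stays causal
print(negate(strengthen(s)))  # negated: causal label flips
```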

pdf
On Generating Fact-Infused Question Variations
Arthur Deschamps | Sujatha Das Gollapalli | See-Kiong Ng
Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2021)

To fully model the human-like ability to ask questions, automatic question generation (QG) models must be able to produce multiple expressions of the same question with different levels of detail. Unfortunately, existing datasets available for learning QG do not include paraphrases or question variations, limiting a model's ability to learn this capability. To address this limitation, we present FIRS, a dataset containing human-generated, fact-infused rewrites of questions from the widely used SQuAD dataset. Questions in FIRS were obtained by combining a given question with facts about entities referenced in the question. We study a double encoder-decoder model, the Fact-Infused Question Generator (FIQG), for learning to generate fact-infused questions from a given question. Experimental results show that FIQG effectively incorporates information from facts to add detail to a given question. To the best of our knowledge, ours is the first study to present fact infusion as a novel form of question paraphrasing.

2020

pdf
ESTeR: Combining Word Co-occurrences and Word Associations for Unsupervised Emotion Detection
Sujatha Das Gollapalli | Polina Rozenshtein | See-Kiong Ng
Findings of the Association for Computational Linguistics: EMNLP 2020

Accurate detection of emotions in user-generated text has been shown to have several applications in e-commerce, public well-being, and disaster management. Currently, state-of-the-art performance for emotion detection in text is obtained using complex deep learning models trained on domain-specific, labeled data. In this paper, we propose ESTeR, an unsupervised model for identifying emotions using a novel similarity function based on random walks on graphs. Our model combines large-scale word co-occurrence information with word associations from lexicons, avoiding not only the dependence on labeled datasets but also an explicit mapping of words to the latent spaces used in emotion-enriched word embeddings. Our similarity function can also be computed efficiently. We study a range of datasets, including recent tweets related to COVID-19, to illustrate the superior performance of our model and report insights on public emotions during the ongoing pandemic.
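
A rough sketch of the random-walk idea follows, using personalized PageRank as a stand-in for the paper's similarity function; the tiny word graph and emotion seed lexicon are assumptions for illustration.

```python
# Sketch in the spirit of ESTeR: build a word graph from co-occurrence and
# lexicon associations, run a random walk with restart (personalized PageRank)
# from an emotion's seed words, and score a text by the walk probabilities of
# its words. The tiny graph and seed lexicon are illustrative assumptions; the
# paper's graph construction and similarity function are not reproduced here.
import networkx as nx

G = nx.Graph()
cooccurrence_edges = [("storm", "flood"), ("flood", "damage"),
                      ("damage", "fear"), ("party", "friends")]
lexicon_edges = [("fear", "afraid"), ("afraid", "panic"),
                 ("friends", "joy"), ("joy", "happy")]
G.add_edges_from(cooccurrence_edges + lexicon_edges)

EMOTION_SEEDS = {"fear": ["afraid", "panic"], "joy": ["happy"]}  # hypothetical

def emotion_score(text: str, emotion: str) -> float:
    seeds = {w: 1.0 for w in EMOTION_SEEDS[emotion] if w in G}
    walk = nx.pagerank(G, alpha=0.85, personalization=seeds)  # restart at seeds
    words = [w for w in text.lower().split() if w in G]
    return sum(walk[w] for w in words) / max(1, len(words))

text = "storm flood damage reported"
print("fear:", round(emotion_score(text, "fear"), 3))  # higher score
print("joy:", round(emotion_score(text, "joy"), 3))    # near zero
```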

2016

pdf
Learning Term Embeddings for Taxonomic Relation Identification Using Dynamic Weighting Neural Network
Anh Tuan Luu | Yi Tay | Siu Cheung Hui | See Kiong Ng
Proceedings of the 2016 Conference on Empirical Methods in Natural Language Processing

pdf
Utilizing Temporal Information for Taxonomy Construction
Luu Anh Tuan | Siu Cheung Hui | See Kiong Ng
Transactions of the Association for Computational Linguistics, Volume 4

Taxonomies play an important role in many applications by organizing domain knowledge into a hierarchy of ‘is-a’ relations between terms. Previous work on automatically constructing taxonomies from text documents either ignored temporal information or used fixed time periods to discretize the time series of documents. In this paper, we propose a time-aware method to automatically construct and effectively maintain a taxonomy from a given series of documents pre-clustered for a domain of interest. The method extracts temporal information from the documents and uses a timestamp contribution function to score the temporal relevance of the evidence from source texts when identifying the taxonomic relations for constructing the taxonomy. Experimental results show that our proposed method outperforms state-of-the-art methods, improving F-measure by 7%–20%. Furthermore, the proposed method can incrementally update the taxonomy by adding fresh relations from new data and removing outdated relations using an information decay function. It thus avoids rebuilding the whole taxonomy from scratch for every update and keeps the taxonomy effectively up to date, so that it tracks the latest information trends in a rapidly evolving domain.
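
A minimal sketch of time-aware evidence scoring follows, assuming an exponential decay form; the paper's actual timestamp contribution and information decay functions are not reproduced here.

```python
# Sketch of time-aware evidence scoring: each piece of evidence for an "is-a"
# relation is weighted by its timestamp, and relations whose decayed score
# falls below a threshold are dropped. The exponential form, half-life, and
# threshold are illustrative assumptions, not the paper's contribution/decay
# functions.
import math
from datetime import date

HALF_LIFE_DAYS = 365.0
THRESHOLD = 0.5

def contribution(evidence_date: date, today: date) -> float:
    """Weight evidence less the older it is (exponential decay)."""
    age_days = (today - evidence_date).days
    return math.exp(-math.log(2) * age_days / HALF_LIFE_DAYS)

def relation_score(evidence_dates: list, today: date) -> float:
    return sum(contribution(d, today) for d in evidence_dates)

today = date(2016, 1, 1)
fresh = [date(2015, 10, 1), date(2015, 12, 1)]   # recent supporting evidence
stale = [date(2010, 3, 1)]                        # outdated evidence only

print("keep fresh relation:", relation_score(fresh, today) >= THRESHOLD)
print("drop stale relation:", relation_score(stale, today) < THRESHOLD)
```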

2015

pdf
Incorporating Trustiness and Collective Synonym/Contrastive Evidence into Taxonomy Construction
Anh Tuan Luu | Jung-jae Kim | See Kiong Ng
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing

2014

pdf
Taxonomy Construction Using Syntactic Contextual Evidence
Anh Tuan Luu | Jung-jae Kim | See Kiong Ng
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2010

pdf
Distributional Similarity vs. PU Learning for Entity Set Expansion
Xiao-Li Li | Lei Zhang | Bing Liu | See-Kiong Ng
Proceedings of the ACL 2010 Conference Short Papers

pdf
Negative Training Data Can be Harmful to Text Classification
Xiao-Li Li | Bing Liu | See-Kiong Ng
Proceedings of the 2010 Conference on Empirical Methods in Natural Language Processing

1991

pdf
Probabilistic LR Parsing for General Context-Free Grammars
See-Kiong Ng | Masaru Tomita
Proceedings of the Second International Workshop on Parsing Technologies

To combine the advantages of probabilistic grammars and generalized LR parsing, an algorithm for constructing a probabilistic LR parser from a given probabilistic context-free grammar is needed. In this paper, implementation issues in adapting Tomita’s generalized LR parser with a graph-structured stack to perform probabilistic parsing are discussed. Wright and Wrigley (1989) proposed a probabilistic LR-table construction algorithm for non-left-recursive context-free grammars. To account for left recursion, a method for computing item probabilities by generating and solving systems of linear equations is presented. The notion of deferred probabilities is proposed as a means of dealing with similar item sets that have differing probability assignments.
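
The role of linear equations under left recursion can be illustrated with a toy left-recursive grammar; the sketch below is a simplified illustration of the general idea, not the paper's LR-table construction algorithm.

```python
# Toy illustration of why left recursion calls for solving linear equations
# rather than finite summation. For the left-recursive grammar
#     A -> A a   (prob 0.4)
#     A -> b     (prob 0.6)
# the probability mass x associated with A satisfies x = 0.4 * x + 0.6,
# i.e. a (here 1x1) system of linear equations. This is a simplified
# illustration of the general idea, not the paper's construction algorithm.
import numpy as np

# Solve (I - R) x = c, where R holds the left-recursive transition
# probabilities and c the non-recursive contributions.
R = np.array([[0.4]])
c = np.array([0.6])
x = np.linalg.solve(np.eye(1) - R, c)
print(x)   # [1.] : the recursion accounts for the full probability mass

# A naive finite summation only approximates this value:
print(sum(0.6 * 0.4**n for n in range(10)))   # ~0.99994
```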