Yong Dai


2021

pdf bib
Creating and Evaluating Resources for Sentiment Analysis in the Low-resource Language: Sindhi
Wazir Ali | Naveed Ali | Yong Dai | Jay Kumar | Saifullah Tumrani | Zenglin Xu
Proceedings of the Eleventh Workshop on Computational Approaches to Subjectivity, Sentiment and Social Media Analysis

In this paper, we develop Sindhi subjective lexicon using a merger of existing English resources: NRC lexicon, list of opinion words, SentiWordNet, Sindhi-English bilingual dictionary, and collection of Sindhi modifiers. The positive or negative sentiment score is assigned to each Sindhi opinion word. Afterwards, we determine the coverage of the proposed lexicon with subjectivity analysis. Moreover, we crawl multi-domain tweet corpus of news, sports, and finance. The crawled corpus is annotated by experienced annotators using the Doccano text annotation tool. The sentiment annotated corpus is evaluated by employing support vector machine (SVM), recurrent neural network (RNN) variants, and convolutional neural network (CNN).

pdf bib
Contextualize Knowledge Bases with Transformer for End-to-end Task-Oriented Dialogue Systems
Yanjie Gou | Yinjie Lei | Lingqiao Liu | Yong Dai | Chunxu Shen
Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing

Incorporating knowledge bases (KB) into end-to-end task-oriented dialogue systems is challenging, since it requires to properly represent the entity of KB, which is associated with its KB context and dialogue context. The existing works represent the entity with only perceiving a part of its KB context, which can lead to the less effective representation due to the information loss, and adversely favor KB reasoning and response generation. To tackle this issue, we explore to fully contextualize the entity representation by dynamically perceiving all the relevant entities and dialogue history. To achieve this, we propose a COntext-aware Memory Enhanced Transformer framework (COMET), which treats the KB as a sequence and leverages a novel Memory Mask to enforce the entity to only focus on its relevant entities and dialogue history, while avoiding the distraction from the irrelevant entities. Through extensive experiments, we show that our COMET framework can achieve superior performance over the state of the arts.