Kiem-Hieu Nguyen


2021

An Uncertainty-Aware Encoder for Aspect Detection
Thi-Nhung Nguyen | Kiem-Hieu Nguyen | Young-In Song | Tuan-Dung Cao
Findings of the Association for Computational Linguistics: EMNLP 2021

Aspect detection is a fundamental task in opinion mining. Previous works use seed words either as priors of topic models, as anchors to guide the learning of aspects, or as features of aspect classifiers. This paper presents a novel weakly-supervised method to exploit seed words for aspect detection based on an encoder architecture. The encoder maps segments and aspects into a low-dimensional embedding space. The goal is for the similarity between segments and aspects in the embedding space to approximate their ground-truth similarity generated from seed words. An objective function is proposed to capture the uncertainty of ground-truth similarity. Our method outperforms previous works on several benchmarks in various domains.

2020

Utilizing Bert for Question Retrieval on Vietnamese E-commerce Sites
Thi-Thanh Ha | Van-Nha Nguyen | Kiem-Hieu Nguyen | Kim-Anh Nguyen | Tien-Thanh Nguyen
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

A Study on Seq2seq for Sentence Compression in Vietnamese
Thi-Trang Nguyen | Huu-Hoang Nguyen | Kiem-Hieu Nguyen
Proceedings of the 34th Pacific Asia Conference on Language, Information and Computation

2018

BKTreebank: Building a Vietnamese Dependency Treebank
Kiem-Hieu Nguyen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2016

A Dataset for Open Event Extraction in English
Kiem-Hieu Nguyen | Xavier Tannier | Olivier Ferret | Romaric Besançon
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)

This article presents a corpus for development and testing of event schema induction systems in English. Schema induction is the task of learning templates with no supervision from unlabeled texts, and of grouping together entities corresponding to the same role in a template. Most of the previous work on this subject relies on the MUC-4 corpus. We describe the limits of using this corpus (size, non-representativeness, similarity of roles across templates) and propose a new, partially-annotated corpus in English which remedies some of these shortcomings. We make use of Wikinews to select the data inside the category Laws & Justice, and query the Google search engine to retrieve different documents on the same events. Only Wikinews documents are manually annotated and can be used for evaluation, while the others can be used for unsupervised learning. We detail the methodology used for building the corpus and evaluate some existing systems on this new data.

2015

Désambiguïsation d’entités pour l’induction non supervisée de schémas événementiels
Kiem-Hieu Nguyen | Xavier Tannier | Olivier Ferret | Romaric Besançon
Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This article presents a generative model for unsupervised event induction. Previous methods in the literature use only the heads of phrases to represent entities. Yet the full phrase (for example, "an armed man") provides more discriminative information (than "man"). Our model takes this information into account and represents it in the distribution of event schemas. We show that these relations play an important role in parameter estimation, and that they lead to more coherent and more discriminative distributions. Experimental results on the MUC-4 corpus confirm these improvements.

Generative Event Schema Induction with Entity Disambiguation
Kiem-Hieu Nguyen | Xavier Tannier | Olivier Ferret | Romaric Besançon
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

2014

Ranking Multidocument Event Descriptions for Building Thematic Timelines
Kiem-Hieu Nguyen | Xavier Tannier | Veronique Moriceau
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

2012

Semantic Relatedness for Biomedical Word Sense Disambiguation
Kiem-Hieu Nguyen | Cheol-Young Ock
Workshop Proceedings of TextGraphs-7: Graph-based Methods for Natural Language Processing