Thi Minh Huyen Nguyen

Also published as: T. M. Huyen Nguyen, Thi Minh Huyen Nguyen, Thi Minh Huyền Nguyễn, Thi-Minh-Huyen Nguyen, Thị Minh Huyền Nguyễn


Overview of VLSP RelEx shared task: A Data Challenge for Semantic Relation Extraction from Vietnamese News
Vu Tran Mai | Hoang-Quynh Le | Duy-Cat Can | Thi Minh Huyen Nguyen | Tran Ngoc Linh Nguyen | Thanh Tam Doan
Proceedings of the 7th International Workshop on Vietnamese Language and Speech Processing


Automated Extraction of Tree Adjoining Grammars from a Treebank for Vietnamese
Phuong Le-Hong | Thi Minh Huyen Nguyen | Phuong Thai Nguyen | Azim Roussanaly
Proceedings of the 10th International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+10)

An empirical study of maximum entropy approach for part-of-speech tagging of Vietnamese texts
Phuong Le-Hong | Azim Roussanaly | Thi Minh Huyen Nguyen | Mathias Rossignol
Actes de la 17e conférence sur le Traitement Automatique des Langues Naturelles. Articles longs

This paper presents an empirical study on the application of the maximum entropy approach for part-of-speech tagging of Vietnamese text, a language with special characteristics which largely distinguish it from occidental languages. Our best tagger explores and includes useful knowledge sources for tagging Vietnamese text and gives a 93.40% overall accuracy and a 80.69% unknown word accuracy on a test set of the Vietnamese treebank. Our tagger significantly outperforms the tagger that is being used for building the Vietnamese treebank, and as far as we are aware, this is the best tagging result ever published for the Vietnamese language.


Building a Large Syntactically-Annotated Corpus of Vietnamese
Phuong-Thai Nguyen | Xuan-Luong Vu | Thi-Minh-Huyen Nguyen | Van-Hiep Nguyen | Hong-Phuong Le
Proceedings of the Third Linguistic Annotation Workshop (LAW III)

Finite-State Description of Vietnamese Reduplication
Phuong Le Hong | Thi Minh Huyen Nguyen | Azim Roussanaly
Proceedings of the 7th Workshop on Asian Language Resources (ALR7)


Word Segmentation of Vietnamese Texts: a Comparison of Approaches
Quang Thắng Đinh | Hồng Phương Lê | Thị Minh Huyền Nguyễn | Cẩm Tú Nguyễn | Mathias Rossignol | Xuân Lương Vũ
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

We present in this paper a comparison between three segmentation systems for the Vietnamese language. Indeed, the majority of Vietnamese words is built by semantic composition from about 7,000 syllables, which also have a meaning as isolated words. So the identification of word boundaries in a text is not a simple task, and ambiguities often appear. Beyond the presentation of the tested systems, we also propose a standard definition for word segmentation in Vietnamese, and introduce a reference corpus developed for the purpose of evaluating such a task. The results observed confirm that it can be relatively well treated by automatic means, although a solution needs to be found to take into account out-of-vocabulary words.

A Metagrammar for Vietnamese LTAG
Phương Lê Hồng | Thị Minh Huyền Nguyễn | Azim Roussanaly
Proceedings of the Ninth International Workshop on Tree Adjoining Grammar and Related Frameworks (TAG+9)


Evaluation of multilingual text alignment systems: the ARCADE II project
Yun-Chuang Chiao | Olivier Kraif | Dominique Laurent | Thi Minh Huyen Nguyen | Nasredine Semmar | François Stuck | Jean Véronis | Wajdi Zaghouani
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

This paper describes the ARCADE II project, concerned with the evaluation of parallel text alignment systems. The ARCADE II project aims at exploring the techniques of multilingual text alignment through a fine evaluation of the existing techniques and the development of new alignment methods. The evaluation campaign consists of two tracks devoted to the evaluation of alignment at sentence and word level respectively. It differs from ARCADE I in the multilingual aspect and the investigation of lexical alignment.

A Lexicalized Tree-Adjoining Grammar for Vietnamese
H. Phuong Le | T. M. Huyen Nguyen | Laurent Romary | Azim Roussanaly
Proceedings of the Fifth International Conference on Language Resources and Evaluation (LREC’06)

In this paper, we present the first sizable grammar built for Vietnamese using LTAG, developed over the past two years, named vnLTAG. This grammar aims at modelling written language and is general enough to be both application- and domain-independent. It can be used for the morpho-syntactic tagging and syntactic parsing of Vietnamese texts, as well as text generation. We then present a robust parsing scheme using vnLTAG and a parser for the grammar. We finish with an evaluation using a test suite.

A language-independent method for the alignement of parallel corpora
Thi Minh Huyền Nguyễn | Mathias Rossignol
Proceedings of the 20th Pacific Asia Conference on Language, Information and Computation


Developping Tools and Building Linguistic Resources for Vietnamese Morpho-syntactic Processing
Thanh Bon Nguyen | Thi Minh Huyen Nguyen | Laurent Romary | Xuan Luong Vu
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)