Huy Tien Nguyen

Also published as: Huy-Tien Nguyen


2025

KDA: Knowledge Distillation Adapter for Cross-Lingual Transfer
Ta-Bao Nguyen | Nguyen-Phuong Phan | Tung Le | Huy Tien Nguyen
Proceedings of the 18th International Natural Language Generation Conference

State-of-the-art cross-lingual transfer often relies on massive multilingual models, but their prohibitive size and computational cost limit their practicality for low-resource languages. An alternative is to adapt powerful, task-specialized monolingual models, but this presents challenges in bridging the vocabulary and structural gaps between languages. To address this, we propose KDA, a Knowledge Distillation Adapter framework that efficiently adapts a fine-tuned, high-resource monolingual model to a low-resource target language. KDA utilizes knowledge distillation to transfer the source model’s task-solving capabilities to the target language in a parameter-efficient manner. In addition, we introduce a novel adapter architecture that integrates source-language token embeddings while learning new positional embeddings, directly mitigating cross-lingual representational mismatches. Our empirical results on zero-shot transfer for Vietnamese Sentiment Analysis demonstrate that KDA significantly outperforms existing methods, offering a new, effective, and computationally efficient pathway for cross-lingual transfer.
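The abstract describes two mechanisms: an adapter that reuses the source model's token embeddings while learning new positional embeddings for the target language, and a knowledge distillation objective against the fine-tuned source model. Below is a minimal Python/PyTorch sketch of those two ideas, not the authors' implementation; the bottleneck adapter shape, class names, and hyperparameters are our assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class CrossLingualAdapter(nn.Module):
    # Sketch: frozen source-language token embeddings plus freshly learned
    # positional embeddings, followed by a residual bottleneck adapter.
    def __init__(self, src_token_emb: nn.Embedding, max_len: int, bottleneck: int):
        super().__init__()
        self.token_emb = src_token_emb
        self.token_emb.weight.requires_grad = False   # keep source embeddings fixed
        self.pos_emb = nn.Embedding(max_len, src_token_emb.embedding_dim)  # learned anew
        self.down = nn.Linear(src_token_emb.embedding_dim, bottleneck)
        self.up = nn.Linear(bottleneck, src_token_emb.embedding_dim)

    def forward(self, ids):
        pos = torch.arange(ids.size(1), device=ids.device)
        h = self.token_emb(ids) + self.pos_emb(pos)
        return h + self.up(F.relu(self.down(h)))      # residual adapter output

def distillation_loss(student_logits, teacher_logits, T=2.0):
    # Standard soft-label distillation: KL divergence between
    # temperature-scaled student and teacher distributions.
    return F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)

Freezing the source token embeddings keeps the transfer parameter-efficient: only the positional embeddings and the small bottleneck weights are trained under the distillation objective.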

MoFin: A Small Vietnamese Language Model for Financial Reasoning via Reinforcement Learning
Nguyen Khac Duy | Vo Quang Tri | Le Ho Bao Nhat | Dinh Cong Huy Hoang | Ngo Quang Huy | Huy Tien Nguyen
Proceedings of the 11th International Workshop on Vietnamese Language and Speech Processing

2022

Bi-directional Cross-Attention Network on Vietnamese Visual Question Answering
Duy-Minh Nguyen-Tran | Tung Le | Minh Le Nguyen | Huy Tien Nguyen
Proceedings of the 36th Pacific Asia Conference on Language, Information and Computation

2020

Fast Word Predictor for On-Device Application
Huy Tien Nguyen | Khoi Tuan Nguyen | Anh Tuan Nguyen | Thanh Lac Thi Tran
Proceedings of the 28th International Conference on Computational Linguistics: System Demonstrations

Trained on large text corpora, deep neural networks achieve promising results on the next-word prediction task. However, deploying such large models on devices must contend with tight constraints on latency and binary size. To address these challenges, we propose a fast word predictor that runs efficiently on mobile devices. Compared with a standard neural network achieving a similar word prediction rate, the proposed model obtains a 60% reduction in memory size and 100X faster inference on a mid-range mobile device. The method is developed as a feature of a chat application that serves more than 100 million users.
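The abstract reports the compression result (60% smaller, 100X faster) but not the technique behind it. As one hedged illustration of how such savings are commonly obtained on-device, the sketch below applies dynamic int8 quantization to a small next-word predictor; the model architecture and sizes are our assumptions, not the paper's.

import torch
import torch.nn as nn

class TinyWordPredictor(nn.Module):
    # Assumed toy architecture: embedding -> GRU -> linear output layer.
    def __init__(self, vocab_size=20000, emb_dim=64, hidden=128):
        super().__init__()
        self.emb = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.GRU(emb_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, ids):
        h, _ = self.rnn(self.emb(ids))
        return self.out(h[:, -1])           # logits for the next word

model = TinyWordPredictor().eval()
# Dynamic quantization: Linear/GRU weights stored as int8, activations kept
# in float; this typically shrinks the model and speeds up CPU inference.
quantized = torch.quantization.quantize_dynamic(
    model, {nn.Linear, nn.GRU}, dtype=torch.qint8
)
logits = quantized(torch.randint(0, 20000, (1, 8)))   # smoke test
print(logits.shape)                                    # torch.Size([1, 20000])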

2018

TSix: A Human-involved-creation Dataset for Tweet Summarization
Minh-Tien Nguyen | Dac Viet Lai | Huy-Tien Nguyen | Le-Minh Nguyen
Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018)

2017

An Ensemble Method with Sentiment Features and Clustering Support
Huy Tien Nguyen | Minh Le Nguyen
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Deep learning models have recently been applied successfully in natural language processing, especially sentiment analysis. Each deep learning model has particular advantages, but it is difficult to combine them in a single model. In our approach, a Convolutional Neural Network (CNN) and a Long Short-Term Memory (LSTM) network are used to learn sentiment-specific features under a freezing scheme, which provides a novel and efficient way to integrate the advantages of both models. In addition, we group documents into clusters by similarity and apply the prediction score of the Naive Bayes SVM (NBSVM) method to boost the classification accuracy of each group. Experiments show that our method achieves state-of-the-art performance on two well-known datasets: IMDB large movie reviews at the document level and Pang & Lee movie reviews at the sentence level.
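To make the clustering-support idea concrete, here is a minimal sketch (our reconstruction, not the paper's code) in Python/scikit-learn: NBSVM-style scores are computed from log-count-ratio features, documents are clustered by TF-IDF similarity, and within each cluster the NBSVM probability is blended with a neural model's score. The per-cluster blending weight alpha and the toy data are assumptions.

import numpy as np
from sklearn.cluster import KMeans
from sklearn.feature_extraction.text import TfidfVectorizer, CountVectorizer
from sklearn.linear_model import LogisticRegression

docs = ["a great movie", "an awful film", "great acting", "awful plot"]
labels = np.array([1, 0, 1, 0])

# NBSVM-style features: binarized counts scaled by the NB log-count ratio r.
counts = CountVectorizer(binary=True).fit_transform(docs).toarray().astype(float)
p = counts[labels == 1].sum(0) + 1.0           # smoothed positive counts
q = counts[labels == 0].sum(0) + 1.0           # smoothed negative counts
r = np.log((p / p.sum()) / (q / q.sum()))      # log-count ratio
nbsvm = LogisticRegression().fit(counts * r, labels)

# Cluster documents by TF-IDF similarity.
tfidf = TfidfVectorizer().fit_transform(docs)
clusters = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(tfidf)

# Blend NBSVM probabilities with (here: dummy) neural scores per cluster.
neural_scores = np.full(len(docs), 0.5)        # stand-in for the CNN/LSTM output
alpha = {0: 0.6, 1: 0.4}                       # per-cluster weight (assumed)
blend = np.array([
    alpha[c] * nb + (1 - alpha[c]) * ns
    for c, nb, ns in zip(clusters, nbsvm.predict_proba(counts * r)[:, 1], neural_scores)
])
print(blend.round(2))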