Cuong Xuan Chu


2025

Automotive Document Labeling Using Large Language Models
Dang Van Thin | Cuong Xuan Chu | Christian Graf | Tobias Kaminski | Trung-Kien Tran
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing: Industry Track

Repairing and maintaining car parts are crucial tasks in the automotive industry, requiring a mechanic to have all relevant technical documents available. However, retrieving the right documents from a huge database heavily depends on domain expertise and is time-consuming and error-prone. By labeling available documents according to the components they relate to, concise and accurate information can be retrieved efficiently. This is a challenging task, though, as the relevance of a document to a particular component strongly depends on the context and the expertise of the domain specialist. Moreover, component terminology varies widely between different manufacturers. We address these challenges by utilizing Large Language Models (LLMs) to enrich and unify a component database via web mining, extracting relevant keywords, and leveraging hybrid search and LLM-based re-ranking to select the most relevant component for a document. We systematically evaluate our method using various LLMs on an expert-annotated dataset and demonstrate that it outperforms the baselines, which rely solely on LLM prompting.

2020

ENTYFI: A System for Fine-grained Entity Typing in Fictional Texts
Cuong Xuan Chu | Simon Razniewski | Gerhard Weikum
Proceedings of the 2020 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

Fiction and fantasy are archetypes of long-tail domains that lack suitable NLP methodologies and tools. We present ENTYFI, a web-based system for fine-grained typing of entity mentions in fictional texts. It builds on 205 automatically induced high-quality type systems for popular fictional domains, and provides recommendations towards reference type systems for given input texts. Users can exploit the richness and diversity of these reference type systems for fine-grained supervised typing. In addition, they can choose among and combine four other typing modules: pre-trained real-world models, unsupervised dependency-based typing, knowledge base lookups, and constraint-based candidate consolidation. The demonstrator is available at: https://d5demos.mpi-inf.mpg.de/entyfi.

2017

Sequence to Sequence Learning for Event Prediction
Dai Quoc Nguyen | Dat Quoc Nguyen | Cuong Xuan Chu | Stefan Thater | Manfred Pinkal
Proceedings of the Eighth International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

This paper presents an approach to the task of predicting an event description from a preceding sentence in a text. Our approach explores sequence-to-sequence learning using a bidirectional multi-layer recurrent neural network. It substantially outperforms previous work in terms of the BLEU score on two datasets derived from WikiHow and DeScript, respectively. Since the BLEU score is not easy to interpret as a measure of event prediction, we complement our study with a second evaluation that exploits the rich linguistic annotation of gold paraphrase sets of events.

2013

Learning Based Approaches for Vietnamese Question Classification Using Keywords Extraction from the Web
Dang Hai Tran | Cuong Xuan Chu | Son Bao Pham | Minh Le Nguyen
Proceedings of the Sixth International Joint Conference on Natural Language Processing