Thìn Đặng Văn

Also published as: Dang Thin, Thin Dang Van, Thìn Đặng Văn, Thìn Dang Van

2026

DNT at #SMM4H–HeaRD 2026: Leveraging BERT-based Encoders and LLMs for Medical Information Extraction
Doan Nhat Tien | Thìn Đặng Văn
Proceedings of the 11th Social Media Mining for Health Research and Applications (SMM4H-HeaRD 2026) Workshop and Shared Tasks

This paper presents our systems for two tasks at #SMM4H-HeaRD 2026. For Task 1 (multilingual Adverse Drug Event detection), we fine-tune BERT-based multilingual models (InfoXLM and XLM-RoBERTa) and Qwen3.5-9B with ensemble methods, achieving 0.8584 macro F1 on the development set and 0.5304 F1 on unseen Farsi. For Task 7 (span detection of ClinicalImpacts and SocialImpacts in opioid narratives), DeBERTa-Large with simplified labeling achieves the best test performance (0.583 relaxed F1, 0.500 strict F1). Our analysis shows that LLMs excel on known languages in Task 1, while transformer-based models with simplified labeling generalize better for NER tasks.

CITD@UIT at SemEval-2026 Task 4: Structured Reasoning and Metric Specialization for Narrative Similarity
Thach Nguyen | Duc-Vu Nguyen | Dang Thin
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We present a synergistic dual-track approach for SemEval-2026 Task 4 on narrative similarity, covering Track A (triple-wise classification) and Track B (narrative representation) through failure-driven data enrichment. The shared task received 71 final submissions from 46 teams across its two tracks. For Track A, we explore three reasoning strategies: hybrid Cross-Encoder–LLM arbitration (66.5% dev), DSPy-based component-wise decomposition (68.0% dev), and a multi-stage pairwise reasoning pipeline with enforced moral agency hierarchies, where the final Gemini 2.5 Pro/Flash system achieves 77.39% on development and 69.25% on test data, ranking 17th among 46 participating teams in the official evaluation. For Track B, we propose BGE-M3 (LoRA), an instruction-guided dense representation model trained with Multiple Negatives Ranking Loss (MNRL); since Track B provides only unlabeled story instances, we specialize the embedding space using adversarial samples synthesized from Track A failure cases, achieving 68.75% in the official evaluation and ranking 6th among 26 participating teams. Our analysis shows that narrative similarity depends more on outcome alignment and moral trajectory than lexical overlap, highlighting the complementary roles of explicit reasoning and task-specific metric-space specialization.
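
The Multiple Negatives Ranking Loss mentioned for Track B can be illustrated with a minimal sketch. This is not the authors' training code: it is a pure-Python toy that assumes cosine similarity, a similarity scale of 20, and two-dimensional vectors, all chosen for illustration. Each anchor treats its paired positive as the correct class and every other positive in the batch as a negative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def multiple_negatives_ranking_loss(anchors, positives, scale=20.0):
    """Mean cross-entropy over in-batch similarities.

    anchors[i] is paired with positives[i]; positives[j] for j != i
    act as negatives. `scale` is an assumed temperature-like factor.
    """
    losses = []
    for i, anchor in enumerate(anchors):
        logits = [scale * cosine(anchor, p) for p in positives]
        # Numerically stable log-sum-exp.
        top = max(logits)
        log_sum_exp = top + math.log(sum(math.exp(x - top) for x in logits))
        losses.append(log_sum_exp - logits[i])
    return sum(losses) / len(losses)


# Toy batch: well-aligned pairs versus deliberately mismatched pairs.
anchors = [[1.0, 0.0], [0.0, 1.0]]
aligned = [[1.0, 0.0], [0.0, 1.0]]
mismatched = [[0.0, 1.0], [1.0, 0.0]]

loss_aligned = multiple_negatives_ranking_loss(anchors, aligned)
loss_mismatched = multiple_negatives_ranking_loss(anchors, mismatched)
```

The loss is near zero when each anchor is closest to its own positive and grows when another item in the batch is closer, which is why hard negatives (such as the failure-derived samples the abstract describes) are informative for this objective.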

Stochastic Gradient Descenders at SemEval-2026 Task 9: Few-Shot LLM Prompting for Polarization Type Classification
Huynh Phu | Dang Thin
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

This paper presents our system for SemEval-2026 Task 9 (POLAR), Subtask 2, which focuses on classifying polarization types in social media text. We investigate three paradigms: (i) fine-tuning mDeBERTa-v3 with domain-adaptive pre-training, (ii) parameter-efficient adaptation of Qwen2.5-32B using LoRA, and (iii) few-shot prompting with Llama-3.3-70B-Instruct. Experimental results show that few-shot prompting, despite requiring no task-specific training, outperforms both fine-tuning and parameter-efficient approaches. Notably, it achieves non-zero F1 scores across all polarization categories, which is critical under macro-averaged evaluation. Our system ranks 2nd out of 29 English submissions on the official leaderboard, achieving an F1 Macro of 0.5157. These findings highlight the effectiveness of large instruction-tuned models in low-resource, label-imbalanced classification settings.
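
Why non-zero per-class F1 matters under macro averaging can be seen in a small sketch. The labels `a`, `b`, `c` and the toy predictions below are hypothetical, and the example is single-label for brevity; it only illustrates the metric, not the shared task's data or scorer.

```python
def per_class_f1(gold, pred, labels):
    """F1 for each label, computed one-vs-rest."""
    scores = {}
    for label in labels:
        tp = sum(1 for g, p in zip(gold, pred) if g == label and p == label)
        fp = sum(1 for g, p in zip(gold, pred) if g != label and p == label)
        fn = sum(1 for g, p in zip(gold, pred) if g == label and p != label)
        denom = 2 * tp + fp + fn
        # A class that is never predicted correctly scores 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(gold, pred, labels):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(gold, pred, labels)
    return sum(scores.values()) / len(labels)


labels = ["a", "b", "c"]
gold = ["a"] * 8 + ["b", "c"]

# Always predicting the majority class: 80% accuracy, two classes at F1 = 0.
majority = ["a"] * 10
# Covering every class, at the cost of one majority-class error.
covering = ["a"] * 7 + ["b", "b", "c"]

macro_majority = macro_f1(gold, majority, labels)
macro_covering = macro_f1(gold, covering, labels)
```

Both prediction sets have similar accuracy, but the majority-only predictions average in two zeros, so their macro F1 is far lower than that of the predictions that cover every class.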

Gradient Descenders at SemEval-2026 Task 9: Data-Centric Counterfactual Augmentation for Multi-Label Hate Speech Detection
Tran Nhan | Dang Thin
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

In this paper, we describe the Gradient Descenders submission to SemEval-2026 Task 9 Subtask 2: Multi-Label Hate Speech Detection. Existing Transformer-based approaches often exhibit degraded performance on this task due to severe class imbalance and complex class intersectionality, leading to the learning of spurious correlations. To counteract this, we introduce a novel, data-centric counterfactual augmentation pipeline. We employ Large Language Models (LLMs) as semantic generators to synthesize diverse, targeted training samples via three distinct prompting strategies: Additive Label-Flipping (Attribute Injection), Context Decoupling, and Cross-Domain Identity Substitution. Fine-tuning a RoBERTa classifier on this augmented corpus significantly improves the model’s sensitivity to minority classes. Ultimately, our system achieves a Macro-F1 score of 44.15% on the official test set, highlighting the efficacy of targeted LLM-based augmentation in highly imbalanced, multi-label environments.

An NLP Framework for Analyzing Corporate Strategic Behavior in the Opioid Industry Documents Archive
Duy Dang Phu | Thìn Đặng Văn
Proceedings of the Seventh Workshop on Natural Language Processing and Computational Social Science

The Opioid Industry Documents Archive (OIDA) provides extensive internal corporate records that offer valuable insight into the drivers of the opioid crisis, yet its use in systematic analysis of corporate strategy remains limited. In this study, we propose an NLP-based framework to analyze strategic behavior in large-scale litigation archives, combining relevance filtering and topic modeling with large language model (LLM)-assisted interpretation. Applied to documents from Insys Therapeutics and Mallinckrodt Pharmaceuticals, our approach uncovers systematic differences in corporate strategies and organizational priorities. These results highlight the potential of integrating representation learning and LLMs for large-scale analysis in public health and corporate accountability research.

EduPulse: A Practical LLM-Enhanced Opinion Mining System for Vietnamese Student Feedback in Educational Platforms
Nguyen Xuan Phuc | Phi Nguyen Xuan | Vinh-Tiep Nguyen | Thìn Dang Van | Ngan Luu-Thuy Nguyen
Proceedings of the 19th Conference of the European Chapter of the Association for Computational Linguistics (Volume 5: Industry Track)

Opinion mining from real-world student feedback presents significant practical challenges, such as handling linguistic noise (slang, teencode) and the need for scalable and maintainable systems, which are often overlooked in academic research. This paper introduces EduPulse, a practical opinion mining system designed specifically to analyze student feedback in Vietnamese. Our application performs four opinion analysis tasks: Sentiment Classification, Category-based Sentiment Classification, Suggestion Detection, and Opinion Summarization. We design a hybrid architecture that strategically balances performance, cost, and maintainability. This architecture leverages the robustness of Large Language Models (LLMs) for complex, noise-sensitive tasks such as sentiment classification and suggestion detection, while employing a specialized, lightweight neural model for high-throughput, low-cost solutions. Our experiments show that applying the LLM-based approach achieves high robustness, justifying its operational cost by eliminating expensive retraining cycles. Furthermore, we demonstrate that our collaborative modular architecture significantly improves task performance (+7.6%) compared to traditional approaches, offering a practical design for industry-focused Natural Language Processing applications.

PhucNguyen@DravidianLangTech 2026: Political Multiclass Sentiment Analysis with XLM-RoBERTa and Low-Rank Adaptation
Dinh Khac Phuc Nguyen | Thìn Đặng Văn
Proceedings of the Sixth Workshop on Speech, Vision, and Language Technologies for Dravidian Languages

Analyzing political sentiment in code-mixed Tamil-English presents significant challenges due to informal jargon, severe class imbalance, and distribution shifts. This paper describes our system for the Political Multiclass Sentiment Analysis shared task at DravidianLangTech@ACL 2026, which categorizes tweets into seven sentiment classes. Our approach leverages XLM-RoBERTa integrated with Low-Rank Adaptation (LoRA). To mitigate majority-class dominance, we combine random oversampling with automated hyperparameter optimization to improve macro-level balance within this Parameter-Efficient Fine-Tuning (PEFT) framework. Enhanced by targeted preprocessing—specifically emoji demojization and noise removal—our system helps preserve nuanced symbolic cues, achieving a macro-average F1-score of 0.3763 and securing Rank 2 on the shared task leaderboard.
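
Random oversampling, as named in the abstract above, can be sketched in a few lines. This is an illustrative standard-library version, not the authors' pipeline: the function name, the fixed seed, the class names, and the choice to duplicate minority examples until every class matches the largest one are all assumptions.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate minority-class examples at random until all classes
    have as many examples as the largest class."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in zip(texts, labels):
        by_label[label].append(text)

    target = max(len(items) for items in by_label.values())
    out_texts, out_labels = [], []
    for label, items in by_label.items():
        # Sample with replacement to fill the gap to the target size.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Toy imbalanced data with hypothetical class names.
texts = ["t1", "t2", "t3", "t4", "t5", "t6"]
labels = ["neutral", "neutral", "neutral", "neutral", "sarcastic", "positive"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
counts = Counter(balanced_labels)
```

Oversampling should be applied to the training split only; duplicating examples before splitting would leak copies into the evaluation data.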

2025

NTA at SemEval-2025 Task 11: Enhanced Multilingual Textual Multi-label Emotion Detection via Integrated Augmentation Learning
Nguyen Pham Hoang Le | An Nguyen Tran Khuong | Tram Nguyen Thi Ngoc | Thin Dang Van
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

Emotion detection in text is crucial for various applications, but progress, especially in multi-label scenarios, is often hampered by data scarcity, particularly for low-resource languages like Emakhuwa and Tigrinya. This lack of data limits model performance and generalizability. To address this, the NTA team developed a system for SemEval-2025 Task 11 that leverages data augmentation techniques (swap, deletion, oversampling, emotion-focused synonym insertion, and synonym replacement) to enhance baseline models for multilingual textual multi-label emotion detection. Our proposed system achieved significantly higher macro F1-scores than the baseline across multiple languages, demonstrating a robust approach to tackling data scarcity. This resulted in a 17th-place overall ranking on the private leaderboard and the highest score for the Tigrinya language, demonstrating the effectiveness of our approach in a low-resource setting.

A.M.P at SciHal2025: Automated Hallucination Detection in Scientific Content via LLMs and Prompt Engineering
Le Nguyen Anh Khoa | Thìn Đặng Văn
Proceedings of the Fifth Workshop on Scholarly Document Processing (SDP 2025)

This paper presents our system developed for SciHal2025: Hallucination Detection for Scientific Content. The primary goal of this task is to detect hallucinated claims based on the corresponding reference. Our methodology leverages strategic prompt engineering to enhance LLMs’ ability to accurately distinguish between factual assertions and hallucinations in scientific contexts. Moreover, we discovered that aggregating the fine-grained classification results from the more complex subtask (subtask 2) into the simplified label set required for the simpler subtask (subtask 1) significantly improved performance compared to direct classification for subtask 1. This work contributes to the development of more reliable AI-powered research tools by providing a systematic framework for hallucination detection in scientific content.
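
The aggregation step described above, predicting fine-grained labels and then collapsing them into the coarser label set, might look like the following sketch. The label names and the mapping are entirely hypothetical, since the abstract does not list the SciHal2025 label sets; only the many-to-one mapping idea is taken from the text.

```python
# Hypothetical fine-grained -> coarse mapping (NOT the official label sets).
FINE_TO_COARSE = {
    "supported": "consistent",
    "wrong_entity": "hallucinated",
    "wrong_number": "hallucinated",
    "no_evidence": "unverifiable",
}


def aggregate_predictions(fine_predictions, mapping=FINE_TO_COARSE):
    """Collapse fine-grained predictions into the coarse label set.

    Raises KeyError on an unmapped label so gaps in the mapping
    are caught early instead of being silently mislabeled.
    """
    return [mapping[label] for label in fine_predictions]


# Output of a (hypothetical) fine-grained classifier for four claims.
fine = ["supported", "wrong_number", "no_evidence", "wrong_entity"]
coarse = aggregate_predictions(fine)
```

One plausible reason this can help is that the fine-grained prompt forces the model to commit to a specific error type before the coarse decision is derived from it.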

MMLabUIT at CoMeDi Shared Task: Text Embedding Techniques versus Generation-Based NLI for Median Judgment Classification
Tai Duc Le | Thin Dang Van
Proceedings of Context and Meaning: Navigating Disagreements in NLP Annotation

This paper presents our approach to the COLING 2025 CoMeDi task in 7 languages, focusing on sub-task 1: Median Judgment Classification with Ordinal Word-in-Context Judgments (OGWiC). Specifically, we need to determine the meaning relation of one word in two different contexts and classify the input into 4 labels. To address sub-task 1, we implement and investigate various solutions, including (1) Stacking and Averaged Embedding techniques with a multilingual BERT-based model; and (2) a Natural Language Inference approach in place of a regular classification process. All experiments were conducted on a P100 GPU from the Kaggle platform. To enrich the input context, we apply Improve Known Data Rate and Text Expansion in some languages. To help the model focus on the target word, a Custom Token was used in the data processing pipeline. Our best official results on the test set are 0.515, 0.518, and 0.524 in terms of Krippendorff’s α score on Task 1. Our participating system achieved a Top 3 ranking in Task 1. Beyond the official results, our best approach also achieved a Krippendorff’s α score of 0.596 on Task 1.

TranTranUIT at MAHED Shared Task: Multilingual Transformer Ensemble with Advanced Data Augmentation and Optuna-based Hyperparameter Optimization
Trinh Tran Tran | Thìn Đặng Văn
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

NguyenTriet at MAHED Shared Task: Ensemble of Arabic BERT Models with Hierarchical Prediction and Soft Voting for Text-Based Hope and Hate Detection
Nguyen Minh Triet | Thìn Đặng Văn
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

912 at TAQEEM 2025: A Distribution-aware Approach to Arabic Essay Scoring
Trong-Tai Dam Vu | Thìn Đặng Văn
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks

PuxAI at QIAS 2025: Multi-Agent Retrieval-Augmented Generation for Islamic Inheritance and Knowledge Reasoning
Nguyen Xuan Phuc | Thìn Đặng Văn
Proceedings of The Third Arabic Natural Language Processing Conference: Shared Tasks