Dang Thin


2025

sonrobok4 Team at SemEval-2025 Task 8: Question Answering over Tabular Data Using Pandas and Large Language Models
Nguyen Son | Dang Thin
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper describes the system of the sonrobok4 team for SemEval-2025 Task 8: DataBench, Question-Answering over Tabular Data. The task requires producing an answer for each given question and dataset ID, ensuring that responses are derived solely from the provided table. We address this task by using large language models (LLMs) to translate natural language questions into executable Python code for querying Pandas DataFrames. Furthermore, we employ techniques such as a rerun mechanism for error handling, structured metadata extraction, and dataset preprocessing to enhance performance. Our best-performing system achieved 89.46% accuracy on Subtask 1, placing in the top 4 on the private test set, and 85.25% accuracy on Subtask 2, placing in the top 9. We focus mainly on Subtask 1. We analyze the effectiveness of different LLMs for structured data reasoning and discuss key challenges in tabular question answering.
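The abstract does not include code, but the described pipeline (LLM generates Pandas code, failures are fed back to the model and retried) can be sketched roughly as follows; `generate_code` stands in for the actual LLM call, which is an assumption, not the authors' implementation:

```python
import pandas as pd


def run_generated_code(code: str, df: pd.DataFrame):
    """Execute a model-generated snippet against the table and return `result`."""
    scope = {"df": df}
    exec(code, scope)  # convention: the snippet must assign its answer to `result`
    return scope["result"]


def answer_with_rerun(question: str, df: pd.DataFrame, generate_code, max_attempts=3):
    """Rerun mechanism: on failure, re-prompt the model with the error message."""
    error = None
    for _ in range(max_attempts):
        code = generate_code(question, error)  # hypothetical LLM call
        try:
            return run_generated_code(code, df)
        except Exception as exc:
            error = str(exc)  # feed the error back on the next attempt
    raise RuntimeError(f"no runnable code after {max_attempts} attempts: {error}")
```

A stub `generate_code` that first returns broken code and then a fix demonstrates the retry loop without an actual model behind it.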

ABCD at SemEval-2025 Task 9: BERT-based and Generation-based models combine with advanced weighted majority soft voting strategy
Tai Le | Dang Thin
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

First submission to SemEval-2025 Task 9 by the ABCD team.

Firefly Team at SemEval-2025 Task 8: Question-Answering over Tabular Data using SQL/Python generation with Closed-Source Large Language Models
Nga Ho | Tuyen Ho | Hung Le | Dang Thin
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

In this paper, we describe the official system of the Firefly team for the two main subtasks of SemEval-2025 Task 8: Question-Answering over Tabular Data. Our solution employs large language models (LLMs) to translate natural language queries into executable code, specifically Python and SQL, which is then used to generate answers in five predefined types. Our empirical evaluation highlights the superiority of Python code generation over SQL for this challenge, and the experimental results show that our system achieved competitive performance in both subtasks: we ranked in the Top 9 in Subtask I (DataBench QA), covering datasets of any size, and placed 5th in Subtask II (DataBench QA Lite), where datasets are restricted to a maximum of 20 rows.

JellyK at SemEval-2025 Task 11: Russian Multi-label Emotion Detection with Pre-trained BERT-based Language Models
Khoa Le | Dang Thin
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper presents our approach for SemEval-2025 Task 11, focusing on multi-label emotion detection in Russian text (Track A). We preprocess the data by handling special characters, punctuation, and emotive expressions to improve feature-label relationships. To select the best-performing model, we fine-tune various pre-trained language models specialized in Russian and evaluate them using k-fold cross-validation. Our results indicate that ruRoberta-large achieved the best macro F1-score among the tested models. Finally, our system achieved fifth place in the unofficial competition ranking.
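As a rough, dependency-free illustration of the evaluation protocol described here (k-fold cross-validation scored with multi-label macro F1), the following sketch uses a pluggable `train_and_predict` callback; the actual system fine-tunes Russian pre-trained language models, which this toy scaffold does not attempt to reproduce:

```python
def macro_f1(y_true, y_pred, n_labels):
    """Macro-averaged F1 over binary label columns (multi-label setting)."""
    scores = []
    for j in range(n_labels):
        tp = sum(t[j] and p[j] for t, p in zip(y_true, y_pred))
        fp = sum(not t[j] and p[j] for t, p in zip(y_true, y_pred))
        fn = sum(t[j] and not p[j] for t, p in zip(y_true, y_pred))
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        scores.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(scores) / n_labels


def kfold_score(data, labels, train_and_predict, k=5):
    """Average macro F1 over k contiguous folds for model selection."""
    n = len(data)
    fold_scores = []
    for i in range(k):
        test_idx = set(range(i * n // k, (i + 1) * n // k))
        train = [(x, y) for j, (x, y) in enumerate(zip(data, labels)) if j not in test_idx]
        test_x = [data[j] for j in sorted(test_idx)]
        test_y = [labels[j] for j in sorted(test_idx)]
        preds = train_and_predict(train, test_x)  # model under evaluation
        fold_scores.append(macro_f1(test_y, preds, n_labels=len(labels[0])))
    return sum(fold_scores) / k
```

Each candidate model is scored this way on the training data, and the one with the highest mean macro F1 is kept for the final submission.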

2023

ABCD Team at SemEval-2023 Task 12: An Ensemble Transformer-based System for African Sentiment Analysis
Dang Thin | Dai Nguyen | Dang Qui | Duong Hao | Ngan Nguyen
Proceedings of the 17th International Workshop on Semantic Evaluation (SemEval-2023)

This paper describes the system of the ABCD team for three main tasks in SemEval-2023 Task 12: AfriSenti-SemEval, sentiment analysis for low-resource African languages using a Twitter dataset. We focus on exploring the performance of ensemble architectures based on the soft voting technique and different pre-trained transformer-based language models. The experimental results show that our system achieved competitive performance in several tracks of Task A: Monolingual Sentiment Analysis, where we ranked Top 3, Top 2, and Top 4 for the Hausa, Igbo, and Moroccan Arabic languages, respectively. Besides, our model achieved competitive results, ranking 14th in the Task B (multilingual) setting, and 14th and 8th in Track 17 and Track 18 of the Task C (zero-shot) setting.
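Soft voting, the ensembling technique named in this abstract, averages the class-probability outputs of several models (optionally with per-model weights) before taking the argmax. A minimal sketch, assuming each model exposes a (samples, classes) probability matrix; the weights and shapes are illustrative, not the paper's configuration:

```python
import numpy as np


def soft_vote(prob_matrices, weights=None):
    """Weighted average of per-model class probabilities, then argmax per sample."""
    probs = np.stack(prob_matrices)            # shape: (models, samples, classes)
    if weights is None:
        weights = np.ones(len(prob_matrices))  # unweighted soft voting
    weights = np.asarray(weights, dtype=float)
    weights = weights / weights.sum()          # normalize so the average is a distribution
    avg = np.tensordot(weights, probs, axes=1)  # shape: (samples, classes)
    return avg.argmax(axis=1)
```

With equal weights this reduces to plain probability averaging; a weighted variant lets a stronger model (e.g. one with a higher validation score) pull the ensemble toward its predictions.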