Yicen Tian


2025

111DUT at SemEval-2025 Task 8: Hierarchical Chain-of-Thought Reasoning and Multi-Model Deliberation for Robust TableQA
Jiaqi Yao | Erchen Yu | Yicen Tian | Yiyang Kang | Jiayi Zhang | Hongfei Lin | Linlin Zong | Bo Xu
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

The proliferation of structured tabular data in domains like healthcare and finance has intensified the demand for precise table question answering, particularly for complex numerical reasoning and cross-domain generalization. Existing approaches struggle with implicit semantics and multi-step arithmetic operations. This paper presents our solution for SemEval-2025 Task 8, comprising three synergistic components: (1) a Schema Profiler that extracts structural metadata via LLM-driven analysis and statistical validation, (2) a Hierarchical Chain-of-Thought module that decomposes questions into four stages (semantic anchoring, schema mapping, query synthesis, and self-correction) to ensure SQL validity, and (3) a Confidence-Accuracy Voting mechanism that resolves discrepancies across LLMs through weighted ensemble decisions. Our framework achieves scores of 81.23 on Databench and 81.99 on Databench_lite, ranking 6th and 5th respectively, demonstrating the effectiveness of structured metadata guidance and cross-model deliberation in complex TableQA scenarios.
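The Confidence-Accuracy Voting step can be pictured as a weighted majority vote over the answers returned by several LLMs. The sketch below only illustrates that idea, not the paper's exact formula: the per-model accuracy weights, confidence scores, and example answers are all hypothetical.

```python
from collections import defaultdict

def confidence_accuracy_vote(candidates):
    """Aggregate answers from several LLMs, weighting each vote by the product
    of the model's self-reported confidence and its historical accuracy.
    `candidates` is a list of (answer, confidence, model_accuracy) tuples."""
    scores = defaultdict(float)
    for answer, confidence, accuracy in candidates:
        scores[answer] += confidence * accuracy
    # Return the answer with the highest accumulated weight.
    return max(scores, key=scores.get)

# Hypothetical example: three models answer the same table question.
votes = [("42", 0.9, 0.81), ("41", 0.7, 0.82), ("42", 0.6, 0.80)]
print(confidence_accuracy_vote(votes))  # -> "42"
```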

DUTIR831 at SemEval-2025 Task 5: A Multi-Stage LLM Approach to GND Subject Assignment for TIBKAT Records
Yicen Tian | Erchen Yu | Yanan Wang | Dailin Li | Jiaqi Yao | Hongfei Lin | Linlin Zong | Bo Xu
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

This paper introduces DUTIR831’s approach to SemEval-2025 Task 5, which focuses on generating relevant subjects from the Integrated Authority File (GND) for tagging multilingual technical records in the TIBKAT database. To address the challenges of understanding the hierarchical GND taxonomy and automating subject assignment, a three-stage approach is proposed: (1) a data synthesis stage that uses LLMs to generate and selectively filter high-quality data, (2) a model training module that leverages LLMs and various training strategies to acquire GND knowledge and refine TIBKAT preferences, and (3) a subject-term completion mechanism consisting of multi-sampling ranking, subject-term extraction with an LLM, vector-based retrieval, and various re-ranking strategies. The quantitative evaluation shows that our system ranks 2nd on the all-subjects datasets and 4th on the tib-core-subjects datasets, while the qualitative evaluation shows that it ranks 2nd on the tib-core-subjects datasets.
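As a rough illustration of the subject-term completion stage, the sketch below merges terms proposed across multiple LLM samples (multi-sampling ranking) with candidates from a vector retriever. The paper's actual extraction, retrieval, and re-ranking strategies are more elaborate; all function names and GND labels here are hypothetical.

```python
from collections import Counter

def rank_subject_terms(sampled_outputs, retrieved_terms, top_k=10):
    """Toy completion step: subjects proposed repeatedly across several LLM
    samples are ranked first, then vector-retrieved candidates fill the rest.
    `sampled_outputs` is a list of subject-term lists, one per LLM sample;
    `retrieved_terms` is an ordered list from the vector retriever."""
    counts = Counter(term for output in sampled_outputs for term in output)
    ranked = [term for term, _ in counts.most_common()]
    for term in retrieved_terms:          # back-fill with retrieval results
        if term not in ranked:
            ranked.append(term)
    return ranked[:top_k]

# Hypothetical GND-style labels for illustration only.
samples = [["Maschinelles Lernen", "Robotik"], ["Maschinelles Lernen"], ["Robotik", "Sensor"]]
print(rank_subject_terms(samples, ["Automation", "Sensor"], top_k=4))
```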

dutir914 at SemEval-2025 Task 1: An integrated approach for Multimodal Idiomaticity Representations
Yanan Wang | Dailin Li | Yicen Tian | Bo Zhang | Wang Jian | Liang Yang
Proceedings of the 19th International Workshop on Semantic Evaluation (SemEval-2025)

SemEval-2025 Task 1 introduces multimodal datasets for idiomatic expression representation. Subtask A focuses on ranking images based on potentially idiomatic noun compounds in given sentences. Idiom comprehension demands the fusion of visual and auditory elements with contextual semantics, yet existing datasets exhibit phrase-image discordance and culture-specific opacity, impeding cross-modal semantic alignment. To address these challenges, we propose an integrated approach that combines data augmentation and model fine-tuning for Subtask A. First, we construct two idiom datasets by generating visual metaphors for idiomatic expressions to fine-tune the CLIP model. Next, we propose a three-stage multimodal chain-of-thought method, fine-tuning Qwen2.5-VL-7B-Instruct to generate rationales and perform inference, alongside zero-shot experiments with Qwen2.5-VL-72B-Instruct. Finally, we integrate the outputs of different models through a voting mechanism to enhance the accuracy of multimodal semantic matching. This approach achieves 0.92 accuracy on the Portuguese test set and 0.93 on the English test set, ranking 3rd and 4th, respectively. The implementation code is publicly available at https://github.com/wyn1015/semeval.
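For Subtask A, the core ranking step amounts to scoring candidate images against the sentence containing the noun compound with CLIP. The sketch below uses the stock openai/clip-vit-base-patch32 checkpoint from Hugging Face transformers as a stand-in for the paper's fine-tuned model, and omits the chain-of-thought and voting stages; the image paths are placeholders.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

# Off-the-shelf CLIP checkpoint as a stand-in for the fine-tuned model.
model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

def rank_images(sentence, image_paths):
    """Rank candidate images by CLIP similarity to the sentence containing
    the potentially idiomatic noun compound (Subtask A setting)."""
    images = [Image.open(p) for p in image_paths]
    inputs = processor(text=[sentence], images=images, return_tensors="pt", padding=True)
    with torch.no_grad():
        logits = model(**inputs).logits_per_text[0]  # similarity of each image to the text
    order = torch.argsort(logits, descending=True)
    return [image_paths[i] for i in order]

# Hypothetical usage with local image files:
# print(rank_images("He finally kicked the bucket.", ["img1.jpg", "img2.jpg", "img3.jpg"]))
```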