Chinh Nguyen

2026

UIT-AMMC at SemEval-2026 Task 13: Exploiting Structural Formatting Signatures for Robust AI-Generated Code Detection
Cuong Pham | Minh Nguyen | Minh Le | An Nguyen | Chinh Nguyen
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

We participated in Subtask A with our Structure-Aware Contrastive Cascade, a multi-stage architecture designed to distinguish between human-authored and machine-generated code by integrating generative reasoning with explicit structural linguistic features. Our system focuses on exploiting structural formatting signatures that frequently emerge in AI-generated code as a byproduct of post-training alignment and readability optimization. The pipeline utilizes a Qwen-2.5-Coder 14B model fine-tuned via QLoRA, incorporating stochastic data augmentation techniques to ensure robustness across unseen programming languages. Final classification is achieved through a late-fusion mechanism that combines contrastive probability scores with statistical metrics of code presentation density. For samples exhibiting high epistemic uncertainty, we implement a multi-agent adversarial debate step to refine the final verdict. This approach enabled our system to achieve a Macro F1 score of 0.802, ranking 3rd on the official leaderboard.

pdf bib abs

5ting at SemEval-2026 Task 8: Strong End-to-End Multi-Turn RAG via LLM-Based Reranking and Faithfulness Control
Thien-Qua T-Nguyen | Chi Hoang | Nguyen Tran | Tri Le | Khanh Truong | Chinh Nguyen
Proceedings of the 20th International Workshop on Semantic Evaluation (2026)

This paper presents a modular multi-turn Retrieval-Augmented Generation (RAG) system designed to mitigate hallucination, context drift, and underspecification. The pipeline combines dual-query merged retrieval and LLM-based reranking to deliver high-precision evidence, improving nDCG@5 by 17.7%. To strictly control hallucination during generation, we introduce a role-separated prompting strategy. - This approach explicitly isolates the conversation history (used solely for intent and coreference resolution) from the retrieved passages (enforced as the exclusive source of factual grounding). - By preventing the language model from misinterpreting prior dialogue turns as factual evidence, the system ranked 3/29 in the SemEval-2026 Task 8 end-to-end evaluation. - Notably, our faithfulness-oriented design achieved a high ROUGE-L F1 score of 0.7692, outperforming larger baselines and demonstrating that explicit grounding constraints are highly effective at ensuring lexical faithfulness and reducing hallucinations.

Co-authors

Venues

SemEval2
WS2

Fix author