2025
UCS-SQL: Uniting Content and Structure for Enhanced Semantic Bridging In Text-to-SQL
Zhenhe Wu
|
Zhongqiu Li
|
Jie Zhang
|
Zhongjiang He
|
Jian Yang
|
Yu Zhao
|
Ruiyu Fang
|
Bing Wang
|
Hongyan Xie
|
Shuangyong Song
|
Zhoujun Li
Findings of the Association for Computational Linguistics: ACL 2025
With the rapid advancement of large language models (LLMs), recent research has increasingly exploited the strong text/code understanding and generation capabilities of LLMs to tackle text-to-SQL tasks. Traditional approaches adopt schema linking to first eliminate redundant tables and columns and then prompt LLMs for SQL generation. However, they often struggle to accurately identify the corresponding tables and columns, owing to discrepancies in naming conventions between natural language questions (NL) and database schemas. Moreover, existing methods overlook the challenge of effectively transforming structural information from NL into SQL. To address these limitations, we introduce UCS-SQL, a novel text-to-SQL framework that unites content and structure pipes to bridge the gap between NL and SQL. Specifically, the content pipe focuses on identifying key content within the original question, while the structure pipe is dedicated to transforming the linguistic structure from NL to SQL. Additionally, we strategically select few-shot examples by considering both the SQL Skeleton and the Question Expression (SS-QE selection method), thus providing targeted examples for SQL generation. Experimental results on BIRD and Spider demonstrate the effectiveness of our UCS-SQL framework.
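As an illustration of the SS-QE idea, the minimal Python sketch below ranks candidate few-shot examples by combining the similarity of their SQL skeletons with the similarity of their question expressions. The skeletonization rules, the Jaccard scoring, the equal weighting, and the use of a draft SQL query to obtain the target-side skeleton are all assumptions for demonstration, not the paper's exact procedure.

import re

def sql_skeleton(sql: str) -> str:
    # Mask string/number literals, then replace non-keyword tokens with "_".
    s = re.sub(r"'[^']*'|\"[^\"]*\"|\b\d+(?:\.\d+)?\b", "_", sql)
    keywords = {"select", "from", "where", "group", "by", "order", "having",
                "join", "on", "and", "or", "not", "in", "limit", "count",
                "avg", "sum", "max", "min", "distinct", "as"}
    tokens = re.findall(r"\w+|[<>=!*(),.]", s.lower())
    return " ".join(t if t in keywords or not t.isidentifier() else "_"
                    for t in tokens)

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / max(len(sa | sb), 1)

def select_examples(question, draft_sql, pool, k=4, w_skel=0.5):
    # pool: list of (example_question, example_sql) pairs
    draft_skel = sql_skeleton(draft_sql)
    return sorted(
        pool,
        key=lambda ex: (w_skel * jaccard(draft_skel, sql_skeleton(ex[1]))
                        + (1 - w_skel) * jaccard(question.lower(), ex[0].lower())),
        reverse=True,
    )[:k]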
INT: Establishing Information Transfer for Multilingual Intent Detection and Slot Filling
Di Wu
|
Liting Jiang
|
Bohui Mao
|
Hongyan Xie
|
Haoxiang Su
|
Zhongjiang He
|
Ruiyu Fang
|
Shuangyong Song
|
Hao Huang
|
Xuelong Li
Findings of the Association for Computational Linguistics: ACL 2025
Multilingual spoken language understanding (SLU) involves intent detection (ID) and slot filling (SF) across multiple languages. The inherent linguistic diversity presents significant challenges in achieving performance comparable to traditional SLU. Recent studies have attempted to improve multilingual SLU performance by sharing multilingual encoders. However, these approaches have not directly established information flow between languages. To address this, we first demonstrate the feasibility of such information transfer and pinpoint the key challenges: prediction error mitigation and multilingual slot alignment. We then propose the INformation Transfer network (INT) to tackle these challenges. The gate unit in INT controls the information flow between languages, reducing the adverse impact of prediction errors on both ID and SF. Additionally, we reformulate SF as a span prediction problem and introduce a slot-matching attention mechanism to achieve slot alignment across languages. Experimental results on the MASSIVE and MASSIVE-UG datasets show that our model outperforms all baselines in overall accuracy across all languages, and demonstrates robust performance when different languages are used as the source.
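The gating idea can be sketched in a few lines of PyTorch: a sigmoid gate decides, per dimension, how much transferred source-language information enters the target-language representation, so that source-side prediction errors can be attenuated. The fusion form and dimensions below are assumptions, not the exact INT architecture.

import torch
import torch.nn as nn

class GateUnit(nn.Module):
    def __init__(self, hidden_size: int):
        super().__init__()
        self.gate = nn.Linear(2 * hidden_size, hidden_size)

    def forward(self, target_h: torch.Tensor, source_h: torch.Tensor) -> torch.Tensor:
        # g in (0, 1) controls how much source information flows in;
        # unreliable source predictions can thus be down-weighted.
        g = torch.sigmoid(self.gate(torch.cat([target_h, source_h], dim=-1)))
        return g * source_h + (1 - g) * target_h

# usage: fused = GateUnit(768)(target_states, source_states)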
2024
Dual Prompt Tuning based Contrastive Learning for Hierarchical Text Classification
Sishi Xiong
|
Yu Zhao
|
Jie Zhang
|
Mengxiang Li
|
Zhongjiang He
|
Xuelong Li
|
Shuangyong Song
Findings of the Association for Computational Linguistics: ACL 2024
Hierarchical text classification aims at categorizing texts into a multi-tiered, tree-structured hierarchy of labels. Existing methods pay more attention to capturing hierarchy-aware text features by exploiting explicit parent-child relationships, while interactions between peer labels are rarely taken into account, resulting in severe label confusion within each layer. In this work, we propose a novel Dual Prompt Tuning (DPT) method, which emphasizes identifying discrimination among peer labels by performing contrastive learning on each hierarchical layer. We design an innovative hand-crafted prompt containing slots for both positive and negative label predictions to cooperate with contrastive learning. In addition, we introduce a label hierarchy self-sensing auxiliary task to ensure cross-layer label consistency. Extensive experiments demonstrate that DPT achieves significant improvements and outperforms current state-of-the-art methods on the BGC and RCV1-V2 benchmark datasets.
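As a rough illustration of per-layer contrastive learning over peer labels, the sketch below computes an InfoNCE-style loss that pulls a text representation toward its gold label and away from its peer labels within one hierarchy level. The loss form, temperature, and cosine similarity are assumptions rather than the exact DPT objective.

import torch
import torch.nn.functional as F

def layer_contrastive_loss(text_emb, gold_label_emb, peer_label_embs, tau=0.1):
    # text_emb: (d,), gold_label_emb: (d,), peer_label_embs: (n_peers, d)
    candidates = torch.cat([gold_label_emb.unsqueeze(0), peer_label_embs], dim=0)
    sims = F.cosine_similarity(text_emb.unsqueeze(0), candidates) / tau
    # the gold label sits at index 0, so it is the contrastive target
    return F.cross_entropy(sims.unsqueeze(0), torch.zeros(1, dtype=torch.long))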
Sentence Segmentation and Punctuation for Ancient Books Based on Supervised In-context Training
Shiquan Wang
|
Weiwei Fu
|
Mengxiang Li
|
Zhongjiang He
|
Yongxiang Li
|
Ruiyu Fang
|
Li Guan
|
Shuangyong Song
Proceedings of the Third Workshop on Language Technologies for Historical and Ancient Languages (LT4HALA) @ LREC-COLING-2024
This paper describes the participation of team “TeleAI” in the Third International Ancient Chinese Language Information Processing Evaluation (EvalHan24). The competition comprises a joint task of sentence segmentation and punctuation, with open and closed tracks defined by the models and data allowed. In the final evaluation, our system achieved significantly better results than the baseline. Specifically, in the closed-track sentence segmentation task we obtained an F1 score of 0.8885, while in the sentence punctuation task we achieved an F1 score of 0.7129.
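For reference, the reported scores follow the usual F1 definition over predicted versus gold segmentation/punctuation decisions; a toy computation is sketched below. The representation as (position, mark) pairs is an assumption, and the official EvalHan24 scorer may define matches differently.

def f1_score(pred: set, gold: set) -> float:
    tp = len(pred & gold)
    precision = tp / len(pred) if pred else 0.0
    recall = tp / len(gold) if gold else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# e.g. (character_index, punctuation_mark) pairs
print(f1_score({(3, "，"), (8, "。")}, {(3, "，"), (7, "。")}))  # 0.5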
TeleChat: An Open-source Bilingual Large Language Model
Zihan Wang
|
Liuxz2@chinatelecom.cn
|
Liusx14@chinatelecom.cn
|
Yitong Yao
|
Huangyy121@chinatelecom.cn
|
Mengxiang Li
|
Zhongjiang He
|
Liyx25@chinatelecom.cn
|
Pulw@chinatelecom.cn
|
Xuhn@chinatelecom.cn
|
Chao Wang
|
Shuangyong Song
Proceedings of the 10th SIGHAN Workshop on Chinese Language Processing (SIGHAN-10)
In this paper, we present TeleChat, a collection of large language models (LLMs) with 7 billion and 12 billion parameters. TeleChat is initially pretrained on an extensive corpus containing a diverse collection of English and Chinese texts, encompassing trillions of tokens. Subsequently, the model undergoes fine-tuning to align with human preferences, following a detailed methodology that we describe. We evaluate the performance of TeleChat on various tasks, including general dialogue generation, language understanding, mathematics, reasoning, code generation, and knowledge-based question answering. Our findings indicate that TeleChat achieves performance comparable to other open-source models of similar size across a wide range of public benchmarks. To support future research and applications utilizing LLMs, we release the fine-tuned model checkpoints of TeleChat-7B and TeleChat-12B, along with the code and a portion of our filtered high-quality pretraining data, to the public community.
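Since the checkpoints are publicly released, they can presumably be loaded with the standard Hugging Face transformers API, as in the sketch below. The repository id "Tele-AI/telechat-7B" and the generation settings are assumptions; consult the official release for the exact usage.

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "Tele-AI/telechat-7B"  # assumed repo id for the 7B checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)

inputs = tokenizer("Give a brief introduction to TeleChat.", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))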