Xing Fu
2025
AIGT: AI Generative Table Based on Prompt
Mingming Zhang | Zhiqing Xiao | Guoshan Lu | Sai Wu | Weiqiang Wang | Xing Fu | Can Yi | Junbo Zhao
Proceedings of the 31st International Conference on Computational Linguistics
Tabular data, which accounts for over 80% of enterprise data assets, is vital in various fields. With growing concerns about privacy protection and data-sharing restrictions, generating high-quality synthetic tabular data has become essential. Recent advancements show that large language models (LLMs) can effectively generate realistic tabular data by leveraging semantic information and overcoming the challenges of high-dimensional data that arise from one-hot encoding. However, current methods do not fully utilize the rich information available in tables. To address this, we introduce AIGT (AI Generative Table based on prompt enhancement), a novel approach that utilizes metadata information, such as table descriptions and schemas, as prompts to generate ultra-high-quality synthetic data. To overcome the token limit constraints of LLMs, we propose long-token partitioning algorithms that enable AIGT to model tables of any scale. AIGT achieves state-of-the-art performance on 14 out of 20 public datasets and two real industry datasets within the Alipay risk control system.
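The abstract mentions partitioning long rows so that metadata-prompted generation fits within an LLM's token budget. Below is a minimal sketch of one way such column partitioning and prompt construction could look; the greedy grouping, function names, and serialization format are assumptions for illustration, not the paper's actual algorithm.

```python
# Illustrative sketch (assumptions, not AIGT's implementation): split a wide
# table's columns into groups whose serialized form fits a token budget, then
# build a metadata-conditioned prompt for each group of a row.

def partition_columns(columns, token_counts, budget):
    """Greedily group columns so each group's serialized row stays under `budget` tokens."""
    groups, current, used = [], [], 0
    for col in columns:
        cost = token_counts[col]
        if current and used + cost > budget:
            groups.append(current)
            current, used = [], 0
        current.append(col)
        used += cost
    if current:
        groups.append(current)
    return groups


def build_prompt(description, schema, group, row):
    """Serialize one column group of a row, prefixed with table metadata as the prompt."""
    header = f"Table: {description}\nSchema: {', '.join(schema[c] for c in group)}\n"
    body = ", ".join(f"{c} is {row[c]}" for c in group)
    return header + body
```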
ALPS: Attention Localization and Pruning Strategy for Efficient Adaptation of Large Language Models
Hao Chen | Haoze Li | Zhiqing Xiao | Lirong Gao | Qi Zhang | Xiaomeng Hu | Ningtao Wang | Xing Fu | Junbo Zhao
Findings of the Association for Computational Linguistics: ACL 2025
Aligning general-purpose large language models (LLMs) to downstream tasks often incurs significant training adjustment costs. Prior research has explored various avenues to enhance alignment efficiency, primarily through minimal-data training or data-driven activations to identify key attention heads. However, these approaches inherently introduce data dependency, which hinders generalization and reusability. To address this issue and enhance model alignment efficiency, we propose the Attention Localization and Pruning Strategy (ALPS), an efficient algorithm that localizes the most task-sensitive attention heads and prunes training by restricting attention updates to these heads, thereby reducing alignment costs. Experimental results demonstrate that our method activates only 10% of attention parameters during fine-tuning while achieving a 2% performance improvement over baselines on three tasks. Moreover, the identified task-specific heads are transferable across datasets and mitigate knowledge forgetting. Our work and findings provide a novel perspective on efficient LLM alignment.
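As a rough illustration of restricting fine-tuning updates to a selected subset of attention heads, the sketch below freezes a model and re-enables gradients only for attention parameters in layers containing the top-scoring heads. The selection criterion, parameter-name matching, and layer-level granularity are assumptions, not the ALPS implementation.

```python
# Illustrative sketch (assumptions, not ALPS): given per-head importance
# scores, keep only the top fraction of attention heads trainable and freeze
# everything else. The scoring function that produces `head_scores` is a
# placeholder for whatever task-sensitivity criterion is used.
import torch.nn as nn

def restrict_training_to_heads(model: nn.Module, head_scores: dict, keep_ratio: float = 0.1):
    """head_scores maps (layer_idx, head_idx) -> importance. Freezes all parameters,
    then re-enables gradients for attention projections of layers holding top heads."""
    k = max(1, int(len(head_scores) * keep_ratio))
    top_heads = set(sorted(head_scores, key=head_scores.get, reverse=True)[:k])

    # Freeze the whole model first.
    for p in model.parameters():
        p.requires_grad = False

    # Re-enable attention projection weights in layers that contain a selected head.
    # (A finer-grained variant would mask gradients per head slice instead.)
    selected_layers = {layer for layer, _ in top_heads}
    for name, p in model.named_parameters():
        if "attn" in name and any(f"layers.{layer}." in name for layer in selected_layers):
            p.requires_grad = True
    return top_heads
```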
Co-authors
- Zhiqing Xiao 2
- Junbo Zhao 2
- Hao Chen (陈昊) 1
- Lirong Gao 1
- Xiaomeng Hu 1