Zhao Chen
Also published as: 钊 陈
2025
QiMeng-Attention: SOTA Attention Operator is generated by SOTA Attention Algorithm
Qirui Zhou
|
Shaohui Peng
|
Weiqiang Xiong
|
Haixin Chen
|
Yuanbo Wen
|
Haochen Li
|
Ling Li
|
Qi Guo
|
Yongwei Zhao
|
Ke Gao
|
Ruizhi Chen
|
Yanjun Wu
|
Zhao Chen
|
Yunji Chen
Findings of the Association for Computational Linguistics: ACL 2025
The attention operator remains a critical performance bottleneck in large language models (LLMs), particularly in long-context scenarios. While FlashAttention is the most widely used and effective GPU-aware acceleration algorithm, it requires time-consuming, hardware-specific manual implementation, limiting its adaptability across GPU architectures. Existing LLMs show considerable promise on code generation tasks but struggle to generate high-performance attention code: they cannot comprehend the complex data flow and computation of the attention operator, nor exploit low-level primitives to extract GPU performance. To address this challenge, we propose an LLM-friendly Thinking Language (LLM-TL) that helps LLMs decouple high-level optimization logic from low-level GPU implementation and deepens their understanding of the attention operator. Combined with a two-stage reasoning workflow, TL-Code generation followed by translation, LLMs can automatically generate FlashAttention implementations on diverse GPUs, establishing a self-optimizing paradigm for producing high-performance attention operators in attention-centric algorithms. Verified on A100, RTX8000, and T4 GPUs, our method significantly outperforms vanilla LLMs, achieving a speed-up of up to 35.16×. Moreover, it not only surpasses human-optimized libraries (cuDNN and the official library) in most scenarios but also extends support to previously unsupported hardware and data types, reducing development time from months to minutes compared with human experts.
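For readers unfamiliar with the optimization the abstract refers to, the sketch below illustrates the tiled, online-softmax formulation at the core of FlashAttention-style kernels. It is a plain NumPy illustration under assumed shapes and block size, not the GPU code produced by the paper's LLM-TL workflow; the function name `tiled_attention` is hypothetical.

```python
# A minimal NumPy sketch of tiled, online-softmax attention: the score matrix
# is never materialized in full, which is the memory-traffic saving that
# FlashAttention-style kernels exploit on GPU.
import numpy as np

def tiled_attention(Q, K, V, block=64):
    """Compute softmax(Q K^T / sqrt(d)) V one key/value block at a time,
    keeping running max/sum statistics per query row."""
    n, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    out = np.zeros_like(Q, dtype=np.float64)
    running_max = np.full(n, -np.inf)
    running_sum = np.zeros(n)
    for start in range(0, K.shape[0], block):
        k_blk = K[start:start + block]
        v_blk = V[start:start + block]
        scores = Q @ k_blk.T * scale                     # (n, block)
        new_max = np.maximum(running_max, scores.max(axis=1))
        # Rescale previously accumulated output and sums to the new max.
        correction = np.exp(running_max - new_max)
        probs = np.exp(scores - new_max[:, None])
        out = out * correction[:, None] + probs @ v_blk
        running_sum = running_sum * correction + probs.sum(axis=1)
        running_max = new_max
    return out / running_sum[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    Q, K, V = (rng.standard_normal((256, 64)) for _ in range(3))
    # Dense reference for verification.
    scores = Q @ K.T / np.sqrt(64)
    probs = np.exp(scores - scores.max(axis=1, keepdims=True))
    ref = (probs / probs.sum(axis=1, keepdims=True)) @ V
    assert np.allclose(tiled_attention(Q, K, V), ref, atol=1e-6)
```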
2021
近十年来澳门的词汇增长 (Macau's Vocabulary Growth in the Recent Ten Years)
Shan Wang (王珊)
|
Zhao Chen (陈钊)
|
Haodi Zhang (张昊迪)
Proceedings of the 20th Chinese National Conference on Computational Linguistics
Vocabulary growth models capture the diachronic evolution of a domain's vocabulary by fitting the quantitative relationship between word types and word tokens. As a place where multiple languages and cultures converge, Macau's vocabulary use reflects the focal concerns of its society, yet there has been no study of the diachronic evolution of Macau's vocabulary. This paper builds the first diachronic corpus of Macau Chinese, fits the corpus's vocabulary growth with three major growth models, and selects the best-fitting Heaps model to further analyze how vocabulary evolution relates to newspaper content. The results show that trends in Macau's vocabulary are closely tied to hot news topics, Macau's government policy agenda, and people's livelihood. The study also validates the method on shuffled text from which temporal ordering has been removed. This is the first study to examine the evolution of Macau's vocabulary on the basis of a large-scale diachronic corpus, and it is of significance for understanding the development of language life in Macau.
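The Heaps model the abstract refers to relates the number of word types V to the number of word tokens N via V = K * N^β. The short sketch below shows how such a fit can be obtained by log-log linear regression; it uses a synthetic Zipf-distributed token stream rather than the Macau Chinese diachronic corpus, and the helper names `type_token_curve` and `fit_heaps` are hypothetical, not from the paper.

```python
# A minimal sketch of fitting Heaps' law, V(N) = K * N^beta, the type-token
# growth model the abstract reports as best-fitting.
import numpy as np

def type_token_curve(tokens):
    """Return (token counts N, type counts V) measured along the stream."""
    seen, Ns, Vs = set(), [], []
    for i, tok in enumerate(tokens, start=1):
        seen.add(tok)
        Ns.append(i)
        Vs.append(len(seen))
    return np.array(Ns), np.array(Vs)

def fit_heaps(Ns, Vs):
    """Least-squares fit of log V = log K + beta * log N."""
    beta, logK = np.polyfit(np.log(Ns), np.log(Vs), 1)
    return np.exp(logK), beta

if __name__ == "__main__":
    # Toy Zipf-distributed token stream standing in for a real corpus.
    rng = np.random.default_rng(0)
    tokens = rng.zipf(1.5, size=50_000)
    Ns, Vs = type_token_curve(tokens)
    K, beta = fit_heaps(Ns, Vs)
    print(f"Heaps' law fit: V ≈ {K:.2f} * N^{beta:.3f}")
```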
2015
A Joint Model for Chinese Microblog Sentiment Analysis
Yuhui Cao
|
Zhao Chen
|
Ruifeng Xu
|
Tao Chen
|
Lin Gui
Proceedings of the Eighth SIGHAN Workshop on Chinese Language Processing