Dianqing Lin
2026
Who Wrote This Line? Evaluating the Detection of LLM-Generated Classical Chinese Poetry
Jiang Li | Tian Lan | Shanshan Wang | Zdongxing | Dianqing Lin | Guanglai Gao | Derek F. Wong | Xiangdong Su
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
The rapid development of large language models (LLMs) has extended text generation tasks into the literary domain. However, AI-generated literary works have raised increasingly prominent concerns about creative authenticity and ethics in the literary world, making the detection of LLM-generated literary texts essential and urgent. While previous work has made significant progress in detecting AI-generated text, it has yet to address classical Chinese poetry. Because of the unique linguistic features of classical Chinese poetry, such as strict metrical regularity, a shared system of poetic imagery, and flexible syntax, determining whether a poem was authored by AI presents a substantial challenge. To address these issues, we introduce ChangAn, a benchmark for detecting LLM-generated classical Chinese poetry that contains 30,664 poems in total: 10,276 human-written poems and 20,388 poems generated by four popular LLMs. Based on ChangAn, we conduct a systematic evaluation of 12 AI detectors, investigating how their performance varies across different text granularities and generation strategies. Our findings highlight the limitations of current Chinese text detectors, which fail to serve as reliable tools for detecting LLM-generated classical Chinese poetry. These results validate the effectiveness and necessity of our proposed ChangAn benchmark. Our dataset and code are available at https://github.com/VelikayaScarlet/ChangAn.
Exploring the Capability Boundaries of LLMs in Mastering of Chinese Chouxiang Language
Dianqing Lin | Tian Lan | Jiali Zhu | Jiang Li | Wei Chen | Xu Liu | Aruukhan | Xiangdong Su | Hongxu Hou | Guanglai Gao
Findings of the Association for Computational Linguistics: ACL 2026
2025
Can Large Language Models Translate Unseen Languages in Underrepresented Scripts?
Dianqing Lin | Aruukhan | Hongxu Hou | Shuo Sun | Wei Chen | Yichen Yang | Guodong Shi
Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing
Large language models (LLMs) have demonstrated impressive performance in machine translation, but they still struggle with unseen low-resource languages, especially those written in underrepresented scripts. To investigate whether LLMs can translate such languages with the help of linguistic resources, we introduce Lotus, a benchmark designed to evaluate translation for Mongolian (in traditional script) and Yi. Our study shows that while linguistic resources can improve translation quality as measured by automatic metrics, LLMs remain limited in their ability to handle these languages effectively. We hope our work provides insights for the low-resource NLP community and fosters further progress in machine translation for low-resource languages written in underrepresented scripts. Our code and data are available.