Keyi Wang
Papers on this page may belong to the following people: Keyi Wang, Keyi Wang
2026
The African Languages Lab: A Collaborative Approach to Advancing Low-Resource African NLP
Sheriff Issaka | Keyi Wang | Yinka Ajibola | Oluwatumininu Samuel-Ipaye | Zhaoyi Zhang | Nicte Aguillon Jimenez | Evans Kofi Agyei | Abraham Lin | Rohan Ramachandran | Sadick Abdul Mumin | Faith Nchifor | Mohammed Shuraim Issah | Erick Rosas Gonzalez | Lieqi Liu | Sylvester Kpei | Jemimah Kusi Osei | Carlene Ajeneza | Persis Boateng | Prisca Adwoa Dufie Yeboah | Saadia Gabriel
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Despite representing nearly one-third of the world’s languages, African languages remain critically underserved by modern NLP technologies, with 88% classified as severely underrepresented or completely ignored in computational linguistics. We present the African Languages Lab (All Lab), a comprehensive research initiative that addresses this technological gap through systematic data collection, model development, and empirical analysis. Our contributions include: (1) a quality-controlled data collection pipeline, yielding the largest validated African multi-modal speech and text dataset spanning 40 languages with 19 billion text tokens and 12,628 hours of aligned speech data; (2) extensive experimental validation demonstrating that even modest-scale models, when fine-tuned on targeted language data, achieve substantial improvements over untrained baselines, averaging +23.69 ChrF++, +0.33 COMET, and +15.34 BLEU points across 31 evaluated languages; and (3) a comparative analysis against Google Translate in which a 1B-parameter model matched or surpassed the commercial system in several languages including Yoruba and Twi, revealing that data scarcity, rather than model scale, constitutes the primary bottleneck for low-resource NLP, and suggesting that systematic dataset development yields disproportionate returns for low-resource languages.
2025
FinNLP-FNP-LLMFinLegal-2025 Shared Task: Regulations Challenge
Keyi Wang | Jaisal Patel | Charlie Shen | Daniel Kim | Andy Zhu | Alex Lin | Luca Borella | Cailean Osborne | Matt White | Steve Yang | Kairong Xiao | Xiao-Yang Liu
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Financial large language models (FinLLMs) have been applied to various tasks in business, finance, accounting, and auditing. Financial services depend on complex regulations and standards, with which LLMs must comply. However, FinLLMs’ performance in understanding and interpreting financial regulations has rarely been studied. We therefore organize the Regulations Challenge, a shared task at COLING FinNLP-FNP-LLMFinLegal-2025, which encourages the academic community to explore the strengths and limitations of popular LLMs. We create 9 novel tasks and corresponding question sets. In this paper, we provide an overview of these tasks and summarize participants’ approaches and results. We aim to raise awareness of FinLLMs’ professional capability in financial regulations and industry standards.
FinNLP-FNP-LLMFinLegal @ COLING 2025 Shared Task: Agent-Based Single Cryptocurrency Trading Challenge
Yangyang Yu | Haohang Li | Yupeng Cao | Keyi Wang | Zhiyang Deng | Zhiyuan Yao | Yuechen Jiang | Dong Li | Ruey-Ling Weng | Jordan W. Suchow
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Despite the promise of agent frameworks based on large language models (LLMs) in stock trading, their capabilities for comprehensive analysis across other financial assets, such as cryptocurrencies, remain largely unexplored. To evaluate the capabilities of LLM-based agent frameworks in cryptocurrency trading, we introduce an LLM-based financial shared task featured at the COLING 2025 FinNLP-FNP-LLMFinLegal workshop, named the Agent-Based Single Cryptocurrency Trading Challenge. This challenge includes two cryptocurrencies: Bitcoin and Ethereum. In this paper, we provide an overview of these tasks and datasets, summarize participants’ methods, and present their experimental evaluations, highlighting the effectiveness of LLMs in addressing cryptocurrency trading challenges. To the best of our knowledge, the Agent-Based Single Cryptocurrency Trading Challenge is one of the first challenges for assessing LLMs in the financial area. Consequently, we provide detailed observations and takeaways for future development in this area.
FinNLP-FNP-LLMFinLegal-2025 Shared Task: Financial Misinformation Detection Challenge Task
Zhiwei Liu | Keyi Wang | Zhuo Bao | Xin Zhang | Jiping Dong | Kailai Yang | Mohsinul Kabir | Polydoros Giannouris | Rui Xing | Seongchan Park | Jaehong Kim | Dong Li | Qianqian Xie | Sophia Ananiadou
Proceedings of the Joint Workshop of the 9th Financial Technology and Natural Language Processing (FinNLP), the 6th Financial Narrative Processing (FNP), and the 1st Workshop on Large Language Models for Finance and Legal (LLMFinLegal)
Despite the promise of large language models (LLMs) in finance, their capabilities for financial misinformation detection (FMD) remain largely unexplored. To evaluate the capabilities of LLMs on the FMD task, we introduce the FMD Challenge, a financial misinformation detection shared task featured at COLING FinNLP-FNP-LLMFinLegal-2025. This challenge aims to evaluate the ability of LLMs to verify financial misinformation while generating plausible explanations. In this paper, we provide an overview of this task and dataset, summarize participants’ methods, and present their experimental evaluations, highlighting the effectiveness of LLMs in addressing the FMD task. To the best of our knowledge, the FMD Challenge is one of the first challenges for assessing LLMs in the field of FMD. Therefore, we provide detailed observations and draw conclusions for the future development of this field.
Co-authors
- Dong Li 2
- Evans Kofi Agyei 1
- Carlene Ajeneza 1
- Yinka Ajibola 1
- Sophia Ananiadou 1
- Zhuo Bao 1
- Persis Boateng 1
- Luca Borella 1
- Yupeng Cao 1
- Zhiyang Deng 1
- Jiping Dong 1
- Saadia Gabriel 1
- Polydoros Giannouris 1
- Erick Rosas Gonzalez 1
- Mohammed Shuraim Issah 1
- Sheriff Issaka 1
- Yuechen Jiang 1
- Nicte Aguillon Jimenez 1
- Mohsinul Kabir 1
- Daniel Kim 1
- Jaehong Kim 1
- Sylvester Kpei 1
- Haohang Li 1
- Alex Lin 1
- Abraham Lin 1
- Xiao-Yang Liu 1
- Zhiwei Liu 1
- Lieqi Liu 1
- Sadick Abdul Mumin 1
- Faith Nchifor 1
- Cailean Osborne 1
- Jemimah Kusi Osei 1
- Seongchan Park 1
- Jaisal Patel 1
- Rohan Ramachandran 1
- Oluwatumininu Samuel-Ipaye 1
- Charlie Shen 1
- Jordan W. Suchow 1
- Ruey-Ling Weng 1
- Matt White 1
- Kairong Xiao 1
- Qianqian Xie 1
- Rui Xing 1
- Steve Yang 1
- Kailai Yang 1
- Zhiyuan Yao 1
- Prisca Adwoa Dufie Yeboah 1
- Yangyang Yu 1
- Xin Zhang 1
- Zhaoyi Zhang 1
- Andy Zhu 1