Jiahui Zhou


2024

pdf
Sailor: Open Language Models for South-East Asia
Longxu Dou | Qian Liu | Guangtao Zeng | Jia Guo | Jiahui Zhou | Xin Mao | Ziqi Jin | Wei Lu | Min Lin
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations

We present Sailor, a family of open language models ranging from 0.5B to 14B parameters, tailored for South-East Asian (SEA) languages. From Qwen1.5, Sailor models accept 200B to 400B tokens during continual pre-training, primarily covering the languages of English, Chinese, Vietnamese, Thai, Indonesian, Malay, and Lao. The training leverages several techniques, including BPE dropout for improving the model robustness, aggressive data cleaning and deduplication, and small proxy models to optimize the data mixture. Experimental results on four typical tasks indicate that Sailor models demonstrate strong performance across different benchmarks, including commonsense reasoning, question answering, reading comprehension and examination. We share our insights to spark a wider interest in developing large language models for multilingual use cases.

pdf
Unveiling Opinion Evolution via Prompting and Diffusion for Short Video Fake News Detection
Linlin Zong | Jiahui Zhou | Wenmin Lin | Xinyue Liu | Xianchao Zhang | Bo Xu
Findings of the Association for Computational Linguistics: ACL 2024

Short video fake news detection is crucial for combating the spread of misinformation. Current detection methods tend to aggregate features from individual modalities into multimodal features, overlooking the implicit opinions and the evolving nature of opinions across modalities. In this paper, we mine implicit opinions within short video news and promote the evolution of both explicit and implicit opinions across all modalities. Specifically, we design a prompt template to mine implicit opinions regarding the credibility of news from the textual component of videos. Additionally, we employ a diffusion model that encourages the interplay among diverse modal opinions, including those extracted through our implicit opinion prompts. Experimental results on a publicly available dataset for short video fake news detection demonstrate the superiority of our model over state-of-the-art methods.