Hongpu Zhu

2026

Encoding Logical Relations of Chinese Complex Sentences within the Universal Dependencies Framework
Hongpu Zhu | Hongzhi Xu
Proceedings of the Fifteenth Language Resources and Evaluation Conference

Clauses in complex sentences always entail certain logical relations such as conjunctive, causative, and concessive. Such logical relations, however, are not properly represented in the Universal Dependencies (UD) framework, being collapsed into an adverbial clause (advcl) or clausal complement (ccomp) relation between clausal heads. This study extends the UD framework by encoding 13 logical relations. With the new framework, which is structurally identical to UD, we construct a training corpus containing about 1,769 sentences extracted from Chinese newswire and annotate an existing Chinese UD corpus (GSD-simp test) as a test set. We train a BERT-based biaffine parser and fine-tune the Qwen-3 model on the training corpus, then evaluate the models on the UD test data. They are compared against four general-purpose LLMs: GPT-4o, GPT-5, Claude 4, and DeepSeek V3.2. We find that the fine-tuned Qwen-3-8B model achieves a UAS/LAS of 0.840/0.757, higher than the BERT-based parser and the general-purpose LLMs. The results confirm the feasibility of our framework and highlight the inherent challenges of parsing hierarchical and implicit inter-clause relations.

2025


Evaluating Large Language Models for In-Context Learning of Linguistic Patterns in Unseen Low Resource Languages
Hongpu Zhu | Yuqi Liang | Wenjing Xu | Hongzhi Xu
Proceedings of the First Workshop on Language Models for Low-Resource Languages

This paper investigates the ability of Large Language Models (LLMs) to capture linguistic patterns from unseen languages and apply them to translation between those languages and English within an in-context learning framework. Inspired by the International Linguistics Olympiad (IOL), we create test data consisting of translation puzzles between 40 low-resource languages and English. We test the LLMs with two strategies: direct prompting and step-by-step prompting. In the latter, the puzzles are manually decomposed into intermediate steps to allow LLMs to learn and apply linguistic rules incrementally. The results show that this strategy can significantly improve the performance of LLMs, achieving results comparable or slightly superior to those of humans when translating the unseen languages into English. However, LLMs still struggle with translating English into the unseen languages, typically those with complex syntactic rules. We further observe that LLMs perform worse on languages with object-subject and noun-adjective word order than on others, reflecting the potential impact of the typological features of languages in the training data.
