Xiang Liu

2026

The value alignment of Large Language Models (LLMs) is critical because values are the foundation of LLM decision-making and behavior. Some recent work shows that LLMs have similar value rankings. However, little is known about how susceptible LLM value rankings are to external influence and how different values are correlated with each other. In this work, we investigate the plasticity of LLM value systems by examining how their value rankings are influenced by different prompting strategies and by exploring the intrinsic relationships between values. To this end, we design 6 value transformation prompting methods (direct instruction, rubrics, in-context learning, scenario, persuasion, and persona) and benchmark their effectiveness on 8 LLMs from 3 different families. Our main findings are that the value rankings of large LLMs are much more susceptible to external influence than those of small LLMs, and that there are intrinsic correlations between certain values (e.g., Privacy and Respect). In addition, through detailed correlation analysis, we find that value correlations are more similar between large LLMs of different families than between small LLMs of the same family. We also find that the scenario method is the strongest persuader and can help entrench the value rankings.

Automated Essay Scoring (AES) is crucial for modern education, particularly with the increasing prevalence of multimodal assessments. However, traditional AES methods struggle with evaluation generalizability and multimodal perception, while even recent Multimodal Large Language Model (MLLM)-based approaches can produce hallucinated justifications and scores misaligned with human judgment. To address these limitations, we introduce CAFES, the first collaborative multi-agent framework specifically designed for AES. It orchestrates three specialized agents: an Initial Scorer for rapid, trait-specific evaluations; a Feedback Pool Manager that aggregates detailed, evidence-grounded feedback; and a Reflective Scorer that iteratively refines scores based on this feedback to enhance human alignment. In extensive experiments with widely adopted MLLMs, CAFES achieves an average relative improvement of 21% in Quadratic Weighted Kappa (QWK) against ground truth, with particularly strong gains in grammatical and lexical diversity. CAFES paves the way for an intelligent multimodal AES system. The code and dataset are available at https://anonymous.4open.science/r/CAFES-C87F/.
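For readers unfamiliar with the metric, the sketch below shows how Quadratic Weighted Kappa is commonly computed for two lists of integer scores. It is an illustration of the standard definition only, not the evaluation code used for CAFES; the function name and the `min_rating`/`max_rating` parameters are assumptions made for this example.

```python
from collections import Counter


def quadratic_weighted_kappa(rater_a, rater_b, min_rating, max_rating):
    """Standard QWK between two equal-length lists of integer ratings.

    Hypothetical helper for illustration; not taken from the CAFES codebase.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    if max_rating <= min_rating:
        raise ValueError("need at least two distinct rating levels")

    num_levels = max_rating - min_rating + 1
    total = len(rater_a)

    # Observed joint counts and per-rater marginal counts.
    observed = Counter(zip(rater_a, rater_b))
    hist_a = Counter(rater_a)
    hist_b = Counter(rater_b)

    weighted_observed = 0.0
    weighted_expected = 0.0
    for i in range(min_rating, max_rating + 1):
        for j in range(min_rating, max_rating + 1):
            # Quadratic penalty grows with the squared score distance.
            weight = (i - j) ** 2 / (num_levels - 1) ** 2
            # Expected count if the two raters were independent.
            expected = hist_a[i] * hist_b[j] / total
            weighted_observed += weight * observed[(i, j)]
            weighted_expected += weight * expected

    if weighted_expected == 0.0:
        # Both raters gave the same single score everywhere.
        return 1.0
    return 1.0 - weighted_observed / weighted_expected
```

A value of 1 indicates perfect agreement with the human scores, 0 indicates chance-level agreement, and negative values indicate systematic disagreement.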

2025

The rapid advancement of perovskite solar cells (PSCs) has led to exponential growth in research publications, creating an urgent need for efficient knowledge management and reasoning systems in this domain. We present a comprehensive knowledge-enhanced system for PSCs that integrates three key components. First, we develop Perovskite-KG, a domain-specific knowledge graph constructed from 1,517 research papers, containing 23,789 entities and 22,272 relationships. Second, we create two complementary datasets: Perovskite-Chat, comprising 55,101 high-quality question-answer pairs generated through a novel multi-agent framework, and Perovskite-Reasoning, containing 2,217 carefully curated materials science problems. Third, we introduce two specialized large language models: Perovskite-Chat-LLM for domain-specific knowledge assistance and Perovskite-Reasoning-LLM for scientific reasoning tasks. Experimental results demonstrate that our system significantly outperforms existing models in both domain-specific knowledge retrieval and scientific reasoning, providing researchers with effective tools for literature review, experimental design, and complex problem-solving in PSC research.