Yibin Xu
2025
LLMsPark: A Benchmark for Evaluating Large Language Models in Strategic Gaming Contexts
Junhao Chen | Jingbo Sun | Xiang Li | Haidong Xin | Yuhao Xue | Yibin Xu | Hao Zhao
Findings of the Association for Computational Linguistics: EMNLP 2025
As large language models (LLMs) advance across diverse tasks, the need for comprehensive evaluation beyond single metrics becomes increasingly important. To fully assess LLM intelligence, it is crucial to examine their interactive dynamics and strategic behaviors. We present LLMsPark, a game theory–based evaluation platform that measures LLMs’ decision-making strategies and social behaviors in classic game-theoretic settings, providing a multi-agent environment to explore strategic depth. Our system cross-evaluates 15 leading LLMs (both commercial and open-source) using leaderboard rankings and scoring mechanisms. Higher scores reflect stronger reasoning and strategic capabilities, revealing distinct behavioral patterns and performance differences across models. This work introduces a novel perspective for evaluating LLMs’ strategic intelligence, enriching existing benchmarks and broadening their assessment in interactive, game-theoretic scenarios. The benchmark and rankings are publicly available at https://llmsparks.github.io/.
2020
Global Context-enhanced Graph Convolutional Networks for Document-level Relation Extraction
Huiwei Zhou | Yibin Xu | Weihong Yao | Zhe Liu | Chengkun Lang | Haibin Jiang
Proceedings of the 28th International Conference on Computational Linguistics
Document-level Relation Extraction (RE) is particularly challenging due to complex semantic interactions among multiple entities in a document. Among existing approaches, Graph Convolutional Networks (GCNs) are among the most effective for document-level RE. However, a traditional GCN simply uses word nodes and an adjacency matrix to represent graphs, which makes it difficult to establish direct connections between distant entity pairs. In this paper, we propose Global Context-enhanced Graph Convolutional Networks (GCGCN), a novel model that uses entities as nodes and the context of entity pairs as edges between nodes to capture rich global context information about entities in a document. Two hierarchical blocks, Context-aware Attention Guided Graph Convolution (CAGGC) for partially connected graphs and Multi-head Attention Guided Graph Convolution (MAGGC) for fully connected graphs, take progressively more global context into account. Meanwhile, we leverage a large-scale distantly supervised dataset to pre-train a GCGCN model with curriculum learning, which is then fine-tuned on the human-annotated dataset to further improve document-level RE performance. The experimental results on DocRED show that our model effectively captures rich global context information in the document, leading to a state-of-the-art result. Our code is available at https://github.com/Huiweizhou/GCGCN.
Co-authors
- Junhao Chen 1
- Haibin Jiang 1
- Chengkun Lang 1
- Xiang Li (李翔) 1
- Zhe Liu 1