Yang Fang

2026

GQLBench: A Large-Scale Cross-Domain, Cross-Dialect Benchmark for NL2GQL
Yanning Su | Yuhang Zhou | Yang Fang | Sen Liu | Guangnan Ye | Hongfeng Chai
Proceedings of the 64th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Despite growing interest in NL2GQL, benchmarking progress has been constrained by the lack of resources that are simultaneously large-scale, cross-domain, and cross-dialect. To address this gap, we present **GQLBench**, a new benchmark built through an automated and scalable framework that integrates NL2SQL-to-NL2GQL conversion with graph-native data generation. GQLBench supports execution-based evaluation on both Cypher and ISO-GQL, covering hundreds of graph databases and over 20k natural language questions for each dialect. By combining converted data from mature NL2SQL resources with synthetic graph-specific queries, it captures both schema diversity from real-world relational sources and graph-native reasoning challenges, including long paths and cycles. Beyond overall performance comparison, GQLBench also enables fine-grained evaluation across dialects, graph patterns, and query complexity. Experiments on advanced LLMs show that even strong proprietary models struggle on GQLBench, with gemini-3-flash achieving only 35.40% average execution accuracy across the two dialects. Our data and code are available at https://github.com/qxssadf/GQLBench.

2024

Advancing Arabic Sentiment Analysis: ArSen Benchmark and the Improved Fuzzy Deep Hybrid Network
Yang Fang | Cheng Xu | Shuhao Guan | Nan Yan | Yuke Mei
Proceedings of the 28th Conference on Computational Natural Language Learning

Sentiment analysis is pivotal in Natural Language Processing for understanding opinions and emotions in text. While advances in sentiment analysis for English are notable, Arabic Sentiment Analysis (ASA) lags behind, despite the growing Arabic-speaking online user base. Existing ASA benchmarks are often outdated and lack comprehensive evaluation capabilities for state-of-the-art models. To bridge this gap, we introduce ArSen, a meticulously annotated COVID-19-themed Arabic dataset, and IFDHN, a novel model incorporating fuzzy logic for enhanced sentiment classification. ArSen provides a contemporary, robust benchmark, and IFDHN achieves state-of-the-art performance on ASA tasks. Comprehensive evaluations demonstrate the efficacy of IFDHN on the ArSen dataset and highlight future research directions in ASA.

2022

Extract-Select: A Span Selection Framework for Nested Named Entity Recognition with Generative Adversarial Training
Peixin Huang | Xiang Zhao | Minghao Hu | Yang Fang | Xinyi Li | Weidong Xiao
Findings of the Association for Computational Linguistics: ACL 2022

Nested named entity recognition (NER) is a task in which named entities may overlap with each other. Span-based approaches treat nested NER as a two-stage span enumeration and classification task, and thus have an innate ability to handle it. However, they suffer from error propagation, neglect of span boundaries, difficulty recognizing long entities, and the need for large-scale annotated data. In this paper, we propose Extract-Select, a span selection framework for nested NER, to tackle these problems. First, we introduce a span selection framework in which nested entities with different input categories are extracted separately by the extractor, thus naturally avoiding the error propagation of two-stage span-based approaches. In the inference phase, the trained extractor selects final results specific to the given entity category. Second, we propose a hybrid selection strategy in the extractor, which not only makes full use of span boundaries but also improves the recognition of long entities. Third, we design a discriminator to evaluate the extraction result, and train both the extractor and the discriminator with generative adversarial training (GAT). The use of GAT greatly reduces the dependence on dataset size. Experimental results on four benchmark datasets demonstrate that Extract-Select outperforms competitive nested NER models, obtaining state-of-the-art results. The proposed model also performs well when less labeled data is available, demonstrating the effectiveness of GAT.

Co-authors

Sen Liu 1

Nan Yan 1
