Xianglong Yu


2023

PromptARA: Improving Deep Representation in Hybrid Automatic Readability Assessment with Prompt and Orthogonal Projection
Jinshan Zeng | Xianglong Yu | Xianchao Tong | Wenyan Xiao
Findings of the Association for Computational Linguistics: EMNLP 2023

Readability assessment aims to automatically classify texts based on readers’ reading levels. Hybrid automatic readability assessment (ARA) models, which use both deep and linguistic features, have attracted rising attention in recent years due to their impressive performance. However, deep features are not fully explored due to the scarcity of training data, and the fusion of deep and linguistic features is not very effective in existing hybrid ARA models. In this paper, we propose a novel hybrid ARA model called PromptARA, which employs prompts to improve deep feature representations and an orthogonal projection layer to fuse deep and linguistic features. A series of experiments is conducted over four English and two Chinese corpora to show the effectiveness of the proposed model. Experimental results demonstrate that the proposed model is superior to state-of-the-art models.

2022

Enhancing Automatic Readability Assessment with Pre-training and Soft Labels for Ordinal Regression
Jinshan Zeng | Yudong Xie | Xianglong Yu | John Lee | Ding-Xuan Zhou
Findings of the Association for Computational Linguistics: EMNLP 2022

The readability assessment task aims to assign a difficulty grade to a text. While neural models have recently demonstrated impressive performance, most do not exploit the ordinal nature of the difficulty grades and put little effort into model initialization to facilitate fine-tuning. We address these limitations with soft labels for ordinal regression and with model pre-training through prediction of pairwise relative text difficulty. We incorporate these two components into a model based on hierarchical attention networks and evaluate its performance on both English and Chinese datasets. Experimental results show that our proposed model outperforms competitive neural models and statistical classifiers on most datasets.