Easy language and text simplification are topical research areas with important applications in many contexts, and various approaches are under active investigation, including prompt-based methods. Estimating the difficulty level of a text becomes a crucial challenge when the estimator serves as a quality-control mechanism in a simplification workflow: it can act as a critic that guides other models responsible for generating text at a difficulty level determined by the user's needs. We present our work in the context of simplified Finnish. We discuss the problems of collecting corpora for training text-difficulty estimation models and report our experiments with such models. The results are promising: the models appear usable both for assessment and for deployment as components in a larger simplification framework.
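As a rough illustration of the critic role described above, the following Python sketch shows how a difficulty estimator could gate the output of a simplification model. The function names (simplify, estimate_difficulty), the CEFR-style target label, and the retry loop are assumptions made for the example, not the authors' actual implementation.

```python
# Minimal sketch of a difficulty estimator acting as a critic in a
# simplification loop. All names and the CEFR-style label are
# illustrative assumptions, not the system described in the abstract.

TARGET_LEVEL = "A2"   # assumed target difficulty label
MAX_ATTEMPTS = 3      # assumed retry budget

def simplify_with_critic(text, simplify, estimate_difficulty):
    """Regenerate until the estimator accepts the target difficulty."""
    candidate = text
    for _ in range(MAX_ATTEMPTS):
        candidate = simplify(candidate, target=TARGET_LEVEL)
        if estimate_difficulty(candidate) == TARGET_LEVEL:
            return candidate      # critic accepts the candidate
    return candidate              # fall back to the last attempt
```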
Large language models (LLMs) demonstrate remarkable capabilities in understanding complex tasks and have achieved commendable performance on graph-related tasks such as node classification, link prediction, and subgraph classification. These tasks primarily depend on local reasoning over the graph structure. However, research has yet to address graph partitioning, a task that requires global perception. Our preliminary findings reveal that vanilla LLMs can handle graph partitioning only on extremely small graphs. To overcome this limitation, we propose a three-phase pipeline that empowers LLMs to partition large-scale graphs: coarsening, reasoning, and refining. The coarsening phase reduces graph complexity; the reasoning phase captures both global and local patterns to generate a coarse partition; and the refining phase ensures topological consistency by projecting the coarse-grained partition back onto the original graph structure. Extensive experiments demonstrate that our framework enables LLMs to perform graph partitioning across varying graph scales, validating both the effectiveness of LLMs for partitioning tasks and the practical utility of the proposed methodology.
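The three phases can be pictured as the skeleton below. The greedy edge-matching coarsening rule and the llm_partition callable are assumptions introduced for this sketch; the abstract does not specify the actual coarsening or prompting strategy.

```python
# Illustrative coarsen / reason / refine skeleton for LLM-based graph
# partitioning. The matching-based coarsening and the llm_partition
# callable are assumptions, not the paper's exact procedure.

def coarsen(edges, num_nodes):
    """Greedily contract a matching; returns a node -> super-node map."""
    parent, matched = list(range(num_nodes)), set()
    for u, v in edges:
        if u not in matched and v not in matched:
            parent[v] = u               # merge v into u
            matched.update((u, v))
    return parent

def contract(edges, parent):
    """Edges between super-nodes, with self-loops dropped."""
    return {(parent[u], parent[v]) for u, v in edges if parent[u] != parent[v]}

def reason(super_edges, k, llm_partition):
    """Ask the LLM to split the small coarsened graph into k blocks."""
    return llm_partition(super_edges, k)    # {super_node: block_id}

def refine(parent, coarse_labels):
    """Project super-node labels back onto the original nodes."""
    return [coarse_labels[parent[v]] for v in range(len(parent))]
```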
Text simplification is an active research topic with applications in multiple domains. In a simplification pipeline, assessment of text difficulty plays a crucial role as a quality-control mechanism: it acts as a critic and guides models to generate text at the difficulty level required by the user. This paper presents our Difficulty-aware Text Simplification System. We evaluate our pipeline on the TSAR shared task dataset and discuss challenges in constructing corpora for training models to assess text difficulty.
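A simple way to assess the difficulty estimator itself is to score its predictions against gold difficulty labels, as in the sketch below. The (sentence, level) pair format and the predict_level callable are assumptions for illustration and are not tied to the TSAR data layout.

```python
# Hypothetical evaluation of a difficulty estimator against gold labels.
# The (sentence, level) pair format and the predict_level callable are
# assumptions; they do not reflect the TSAR shared task data layout.

def accuracy(gold_pairs, predict_level):
    """Fraction of sentences whose predicted level matches the gold label."""
    hits = sum(predict_level(sent) == level for sent, level in gold_pairs)
    return hits / len(gold_pairs) if gold_pairs else 0.0

# Example usage with a trivial stand-in predictor:
example = [("Tämä on helppo lause.", "easy"), ("Lause on mutkikas.", "hard")]
print(accuracy(example, lambda s: "easy"))   # -> 0.5
```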