Akio Hayakawa


2024

An Extensible Massively Multilingual Lexical Simplification Pipeline Dataset using the MultiLS Framework
Matthew Shardlow | Fernando Alva-Manchego | Riza Batista-Navarro | Stefan Bott | Saul Calderon Ramirez | Rémi Cardon | Thomas François | Akio Hayakawa | Andrea Horbach | Anna Huelsing | Yusuke Ide | Joseph Marvin Imperial | Adam Nohejl | Kai North | Laura Occhipinti | Nelson Peréz Rojas | Nishat Raihan | Tharindu Ranasinghe | Martin Solis Salazar | Marcos Zampieri | Horacio Saggion
Proceedings of the 3rd Workshop on Tools and Resources for People with REAding DIfficulties (READI) @ LREC-COLING 2024

We present preliminary findings on the MultiLS dataset, developed in support of the 2024 Multilingual Lexical Simplification Pipeline (MLSP) Shared Task. This dataset currently comprises 300 instances of lexical complexity prediction and lexical simplification across 10 languages. In this paper, we (1) describe the annotation protocol to support the contribution of future datasets and (2) present summary statistics on the existing data that we have gathered. Multilingual lexical simplification can help low-ability readers engage with otherwise difficult texts in their native, often low-resourced, languages.

2022

JADES: New Text Simplification Dataset in Japanese Targeted at Non-Native Speakers
Akio Hayakawa | Tomoyuki Kajiwara | Hiroki Ouchi | Taro Watanabe
Proceedings of the Workshop on Text Simplification, Accessibility, and Readability (TSAR-2022)

Text simplification is user-dependent, which makes its evaluation unclear. A targeted evaluation dataset clarifies the purpose of simplification, though its specification is hard to define. We built JADES (JApanese Dataset for the Evaluation of Simplification), a text simplification dataset targeted at non-native Japanese speakers, according to public vocabulary and grammar profiles. JADES comprises 3,907 complex-simple sentence pairs annotated by an expert. Analysis of JADES shows that wide-ranging and multiple rewriting operations were applied during simplification. Furthermore, we analyzed the outputs of several benchmark systems on JADES, along with their automatic and manual scores. The results of these analyses highlight differences between English and Japanese in operations and evaluations.