Yuki Fujiwara
2026
HOTATE: A Japanese Dialogue Corpus Annotated with Responses of Private Thoughts and Public Statements
Yuko Toda | Daisuke Maekawa | Kota Manabe | Eito Yoneyama | Kanade Nonomura | Yuki Fujiwara | Tomoyuki Kajiwara
Proceedings of the Fifteenth Language Resources and Evaluation Conference
This study aims to reveal how accurately Large Language Models (LLMs) can deal with a speaker’s actual utterances and the true feelings behind them in Japanese dialogue. Speakers use not only private thoughts, which express their true feelings and intentions, but also public statements, which convey their intentions while considering the interlocutor’s feelings and social status. While public statements help maintain interpersonal relationships, they can obscure the speaker’s true intentions, potentially leading to misunderstandings. We extended existing Japanese dialogue corpora by annotating public-statement and private-thought responses for each dialogue, and then evaluated LLMs’ ability to classify and generate these two types of expressions. The results of the classification task revealed that current LLMs do not understand these expressions at all, and that training with our corpus can significantly improve recognition performance. Furthermore, the results of the generation task demonstrated that generating private thoughts is more difficult than generating public statements, according to both automatic and human evaluations. We release our corpus, which contains 7,964 human-annotated dialogues.
2025
EhiMeNLP at TSAR 2025 Shared Task: Candidate Generation via Iterative Simplification and Reranking by Readability and Semantic Similarity
Rina Miyata | Koki Horiguchi | Risa Kondo | Yuki Fujiwara | Tomoyuki Kajiwara
Proceedings of the Fourth Workshop on Text Simplification, Accessibility and Readability (TSAR 2025)
We introduce the EhiMeNLP submission, which won the TSAR 2025 Shared Task on Readability-Controlled Text Simplification. Our system employs a two-step strategy of candidate generation and reranking. For candidate generation, we simplify the given text into more readable versions by prompting multiple large language models. For reranking, we then select the best candidate by filtering on readability and ranking by semantic similarity to the original text.
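The filter-then-rank step described above can be illustrated with a minimal sketch. This is not the paper's implementation: it uses mean word length as a crude stand-in for a readability model and Jaccard word overlap as a stand-in for embedding-based semantic similarity; all function names and the example sentences are illustrative.

```python
def readability_proxy(text: str) -> float:
    """Lower score = easier to read (mean word length as a crude proxy)."""
    words = text.split()
    return sum(len(w) for w in words) / len(words) if words else 0.0

def similarity_proxy(a: str, b: str) -> float:
    """Jaccard overlap of lowercase word sets, standing in for embedding cosine."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def rerank(original: str, candidates: list[str], max_readability: float) -> str:
    """Keep candidates that pass the readability filter, then return the one
    most semantically similar to the original text."""
    readable = [c for c in candidates if readability_proxy(c) <= max_readability]
    pool = readable or candidates  # fall back if the filter removes everything
    return max(pool, key=lambda c: similarity_proxy(original, c))

# Hypothetical candidates, as if produced by iterative LLM simplification.
original = "The physician administered the medication expeditiously."
candidates = [
    "The doctor gave the medicine quickly.",
    "The physician administered the drug fast.",
]
best = rerank(original, candidates, max_readability=6.0)
```

In the real system, the readability filter and the similarity ranker would each be backed by dedicated models rather than these word-level proxies, but the control flow (generate many candidates, discard too-hard ones, pick the most meaning-preserving survivor) is the same.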