Reon Kajikawa
2025
Text Normalization for Japanese Sentiment Analysis
Risa Kondo | Ayu Teramen | Reon Kajikawa | Koki Horiguchi | Tomoyuki Kajiwara | Takashi Ninomiya | Hideaki Hayashi | Yuta Nakashima | Hajime Nagahara
Proceedings of the Tenth Workshop on Noisy and User-generated Text
We manually normalize noisy Japanese expressions on social networking services (SNS) to improve the performance of sentiment polarity classification. Despite advances in pre-trained language models, informal expressions found in social media still plague natural language processing. In this study, we analyzed 6,000 posts from a sentiment analysis corpus for Japanese SNS text, and constructed a text normalization taxonomy consisting of 33 types of editing operations. Text normalization according to our taxonomy significantly improved the performance of BERT-based sentiment analysis in Japanese. Detailed analysis reveals that most types of editing operations each contribute to improving the performance of sentiment analysis.
2024
Multi-Source Text Classification for Multilingual Sentence Encoder with Machine Translation
Reon Kajikawa | Keiichiro Yamada | Tomoyuki Kajiwara | Takashi Ninomiya
Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 4: Student Research Workshop)
To reduce the cost of training models for each language for developers of natural language processing applications, pre-trained multilingual sentence encoders are promising. However, since training corpora for such multilingual sentence encoders contain only a small amount of text in languages other than English, they suffer from performance degradation for non-English languages. To improve the performance of pre-trained multilingual sentence encoders for non-English languages, we propose a method of machine translating a source sentence into English and then inputting it together with the source sentence in a multi-source manner. Experimental results on sentiment analysis and topic classification tasks in Japanese revealed the effectiveness of the proposed method.
Co-authors
- Tomoyuki Kajiwara 2
- Takashi Ninomiya 2
- Hideaki Hayashi 1
- Koki Horiguchi 1
- Risa Kondo 1