Keita Fukushima




2025

Reversible Disentanglement of Meaning and Language Representations from Multilingual Sentence Encoders
Keita Fukushima | Tomoyuki Kajiwara | Takashi Ninomiya
Proceedings of the 5th Workshop on Multilingual Representation Learning (MRL 2025)

We propose an unsupervised method to disentangle sentence embeddings from multilingual sentence encoders into language-specific and language-agnostic representations. The language-agnostic representations distilled by our method can estimate cross-lingual semantic sentence similarity via cosine similarity. Previous studies trained individual extractors to distill the language-specific and language-agnostic representations. This approach suffers from missing information: the original sentence embedding cannot be fully reconstructed from the two representations, which degrades performance in estimating cross-lingual sentence similarity. We instead train only the extractor for language-agnostic representations and treat the language-specific representation as the residual from the original sentence embedding, so no information is lost. Experimental results on two tasks, machine translation quality estimation and cross-lingual sentence similarity estimation, show that our proposed method outperforms existing unsupervised methods.
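The residual construction described in the abstract can be sketched minimally as follows. This is an illustration, not the paper's implementation: a fixed random linear map stands in for the trained language-agnostic extractor, and the names `extract_agnostic` and `disentangle` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for a trained language-agnostic extractor:
# a fixed random linear projection over an 8-dim embedding space.
D = 8
W = rng.standard_normal((D, D)) / np.sqrt(D)

def extract_agnostic(e):
    """Map a sentence embedding to its language-agnostic part (toy extractor)."""
    return W @ e

def disentangle(e):
    """Split an embedding into language-agnostic and language-specific parts.

    The language-specific part is defined as the residual, so summing the
    two parts reconstructs the original embedding exactly.
    """
    agnostic = extract_agnostic(e)
    specific = e - agnostic  # residual: no information is discarded
    return agnostic, specific

def cosine(u, v):
    """Cosine similarity, used on language-agnostic parts for cross-lingual similarity."""
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Toy multilingual sentence embedding.
e = rng.standard_normal(D)
agnostic, specific = disentangle(e)

# Reconstruction is lossless by construction, unlike training two separate extractors.
assert np.allclose(agnostic + specific, e)
```

Because the language-specific part is defined as a difference rather than extracted by a second trained model, the decomposition is reversible by construction; only the language-agnostic extractor needs training.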