Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings

Hiroaki Yamagiwa, Yusuke Takase, Hidetoshi Shimodaira


Abstract
Word embedding is one of the most important components in natural language processing, but interpreting high-dimensional embeddings remains a challenging problem. To address this problem, Independent Component Analysis (ICA) has been identified as an effective solution. ICA-transformed word embeddings reveal interpretable semantic axes; however, the order of these axes is arbitrary. In this study, we focus on this property and propose a novel method, Axis Tour, which optimizes the order of the axes. Inspired by Word Tour, a one-dimensional word embedding method, we aim to improve the clarity of the word embedding space by maximizing the semantic continuity of the axes. Furthermore, we show through experiments on downstream tasks that Axis Tour yields low-dimensional embeddings that are better than or comparable to those of both PCA and ICA.
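The idea of ordering axes for semantic continuity can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify how continuity is measured or optimized, so the sketch assumes that each axis is represented by the mean of the normalized embeddings of its top-k words, that continuity is the cosine similarity between adjacent axis representations, and that a greedy nearest-neighbor heuristic stands in for a proper tour solver. The names `axis_embeddings` and `greedy_axis_order`, the parameter `k`, and the random matrix `S` are all hypothetical.

```python
import numpy as np

def axis_embeddings(S, k=5):
    """Represent each axis by the mean of the k word vectors scoring highest on it.

    S is assumed to be an already ICA-transformed matrix (words x axes).
    """
    # Normalize rows so that every word contributes a unit-length vector.
    rows = S / np.linalg.norm(S, axis=1, keepdims=True)
    # top[:, j] holds the indices of the k words with the largest value on axis j.
    top = np.argsort(-S, axis=0)[:k]
    return np.stack([rows[top[:, j]].mean(axis=0) for j in range(S.shape[1])])

def greedy_axis_order(S, k=5, start=0):
    """Order axes so that consecutive axes are similar (greedy heuristic, an assumption)."""
    V = axis_embeddings(S, k)
    V = V / np.linalg.norm(V, axis=1, keepdims=True)
    sim = V @ V.T  # cosine similarity between axis representations
    order = [start]
    remaining = set(range(sim.shape[0])) - {start}
    while remaining:
        last = order[-1]
        # Append the unvisited axis most similar to the last one placed.
        nxt = max(remaining, key=lambda j: sim[last, j])
        order.append(nxt)
        remaining.remove(nxt)
    return order

# Random stand-in for ICA-transformed embeddings: 200 words, 8 axes.
rng = np.random.default_rng(0)
S = rng.standard_normal((200, 8))
order = greedy_axis_order(S)
S_reordered = S[:, order]  # same embeddings, columns permuted
```

Reordering only permutes the columns, so distances and inner products between words are unchanged; what changes is which axes sit next to each other, and therefore which axes end up grouped when consecutive axes are merged into a lower-dimensional representation.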
Anthology ID:
2024.findings-emnlp.28
Volume:
Findings of the Association for Computational Linguistics: EMNLP 2024
Month:
November
Year:
2024
Address:
Miami, Florida, USA
Editors:
Yaser Al-Onaizan, Mohit Bansal, Yun-Nung Chen
Venue:
Findings
Publisher:
Association for Computational Linguistics
Pages:
477–506
URL:
https://aclanthology.org/2024.findings-emnlp.28
DOI:
10.18653/v1/2024.findings-emnlp.28
Cite (ACL):
Hiroaki Yamagiwa, Yusuke Takase, and Hidetoshi Shimodaira. 2024. Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings. In Findings of the Association for Computational Linguistics: EMNLP 2024, pages 477–506, Miami, Florida, USA. Association for Computational Linguistics.
Cite (Informal):
Axis Tour: Word Tour Determines the Order of Axes in ICA-transformed Embeddings (Yamagiwa et al., Findings 2024)
PDF:
https://preview.aclanthology.org/dois-2013-emnlp/2024.findings-emnlp.28.pdf