Ran Iwamoto


2021

pdf
Polar Embedding
Ran Iwamoto | Ryosuke Kohita | Akifumi Wachi
Proceedings of the 25th Conference on Computational Natural Language Learning

Hierarchical relationships are invaluable information for many natural language processing (NLP) tasks. Distributional representation has become a fundamental approach for encoding word relationships, particularly embeddings in hyperbolic space showed great performance in representing hierarchies by taking advantage of their spatial properties. However, most machine learning systems do not suppose to use in such complex non-Euclidean geometries. To achieve hierarchy representations in commonly used Euclidean space, we propose Polar Embedding that learns word embeddings with the polar coordinate system. Utilizing characteristics of polar coordinates, the hierarchy of words is expressed with two independent variables: radius (generality) and angles (similarity), and their variables are optimized separately. Polar embedding shows word hierarchies explicitly and allows us to use beneficial resources such as word frequencies or word generality annotations for computing radiuses. We introduce an optimization method for learning angles in limited ranges of polar coordinates, which combining a loss function controlling gradient and distribution uniformization. Experimental results on hypernymy datasets indicate that our approach outperforms other embeddings in low-dimensional Euclidean space and competitively performs even with hyperbolic embeddings, which possess a geometric advantage.

pdf
A Universal Dependencies Corpora Maintenance Methodology Using Downstream Application
Ran Iwamoto | Hiroshi Kanayama | Alexandre Rademaker | Takuya Ohko
Proceedings of the Third Workshop on Computational Typology and Multilingual NLP

This paper investigates updates of Universal Dependencies (UD) treebanks in 23 languages and their impact on a downstream application. Numerous people are involved in updating UD’s annotation guidelines and treebanks in various languages. However, it is not easy to verify whether the updated resources maintain universality with other language resources. Thus, validity and consistency of multilingual corpora should be tested through application tasks involving syntactic structures with PoS tags, dependency labels, and universal features. We apply the syntactic parsers trained on UD treebanks from multiple versions (2.0 to 2.7) to a clause-level sentiment extractor. We then analyze the relationships between attachment scores of dependency parsers and performance in application tasks. For future UD developments, we show examples of outputs that differ depending on version.

2020

pdf
How Universal are Universal Dependencies? Exploiting Syntax for Multilingual Clause-level Sentiment Detection
Hiroshi Kanayama | Ran Iwamoto
Proceedings of the Twelfth Language Resources and Evaluation Conference

This paper investigates clause-level sentiment detection in a multilingual scenario. Aiming at a high-precision, fine-grained, configurable, and non-biased system for practical use cases, we have designed a pipeline method that makes the most of syntactic structures based on Universal Dependencies, avoiding machine-learning approaches that may cause obstacles to our purposes. We achieved high precision in sentiment detection for 17 languages and identified the advantages of common syntactic structures as well as issues stemming from structural differences on Universal Dependencies. In addition to reusable tips for handling multilingual syntax, we provide a parallel benchmarking data set for further research.

pdf
RIJP at SemEval-2020 Task 1: Gaussian-based Embeddings for Semantic Change Detection
Ran Iwamoto | Masahiro Yukawa
Proceedings of the Fourteenth Workshop on Semantic Evaluation

This paper describes the model proposed and submitted by our RIJP team to SemEval 2020 Task1: Unsupervised Lexical Semantic Change Detection. In the model, words are represented by Gaussian distributions. For Subtask 1, the model achieved average scores of 0.51 and 0.70 in the evaluation and post-evaluation processes, respectively. The higher score in the post-evaluation process than that in the evaluation process was achieved owing to appropriate parameter tuning. The results indicate that the proposed Gaussian-based embedding model is able to express semantic shifts while having a low computational