When learning new vocabulary, both humans and machines acquire critical information about the meaning of an unfamiliar word from its context in a sentence or passage. However, not all contexts are equally helpful for learning an unfamiliar ‘target’ word. Some contexts provide a rich set of semantic clues to the target word’s meaning, while others are less supportive. We explore the task of finding educationally supportive contexts for a given target word in vocabulary learning scenarios, particularly for improving student literacy skills. Because attention mechanisms operate directly on a word’s context, attention-based deep learning methods provide a natural starting point. We evaluate attention-based approaches for predicting the amount of educational support a context provides, ranging from a simple custom model that adds an attention layer on top of pre-trained embeddings, to a commercial Large Language Model (LLM). Using an existing major benchmark dataset for educational context support prediction, we found that a sophisticated but generic LLM performed poorly, while the simpler custom attention-based model achieved the best performance reported to date on this dataset.
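As a rough illustration of the "pre-trained embeddings plus an attention layer" idea described above, the sketch below (PyTorch assumed) lets a target-word embedding attend over the context token embeddings and regresses a single support score. All names (ContextSupportScorer, the argument layout) are hypothetical and not taken from the paper.

```python
# Minimal sketch: score a context for a target word by attending from the
# target embedding over the context token embeddings (names are illustrative).
import torch
import torch.nn as nn

class ContextSupportScorer(nn.Module):
    def __init__(self, embedding_matrix: torch.Tensor):
        super().__init__()
        vocab_size, dim = embedding_matrix.shape
        # Pre-trained embeddings are frozen; only the attention layer and head are trained.
        self.emb = nn.Embedding.from_pretrained(embedding_matrix, freeze=True)
        self.attn = nn.MultiheadAttention(dim, num_heads=1, batch_first=True)
        self.head = nn.Linear(dim, 1)  # regression to a support score

    def forward(self, context_ids, target_ids):
        ctx = self.emb(context_ids)               # (B, L, D) context tokens
        tgt = self.emb(target_ids).unsqueeze(1)   # (B, 1, D) target word
        pooled, _ = self.attn(query=tgt, key=ctx, value=ctx)  # target attends over context
        return self.head(pooled.squeeze(1)).squeeze(-1)       # (B,) support scores

# Toy usage with random embeddings and token ids.
emb = torch.randn(1000, 64)
model = ContextSupportScorer(emb)
scores = model(torch.randint(0, 1000, (4, 20)), torch.randint(0, 1000, (4,)))
print(scores.shape)  # torch.Size([4])
```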
Knowledge graph embedding aims to represent entities and relations as low-dimensional vectors, which is an effective way to predict missing links. It is crucial for knowledge graph embedding models to model and infer various relation patterns, such as symmetry/antisymmetry. However, many existing approaches fail to model semantic hierarchies, which are common in the real world. We propose a new model called HRQE, which represents entities as pure quaternions. The relation embedding consists of two parts: (a) a unit quaternion representing the rotation part in 3D space, where head entities are rotated by the corresponding relation through the Hamilton product; (b) scale parameters that constrain the moduli of entities so that they follow hierarchical distributions. To the best of our knowledge, HRQE is the first model that can simultaneously encode multiple relation patterns (symmetry/antisymmetry, inversion, and composition) and learn semantic hierarchies. Experimental results demonstrate the effectiveness of HRQE against state-of-the-art methods on four well-established knowledge graph completion benchmarks.
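The sketch below illustrates only the rotation-plus-scaling mechanics described above: rotating a pure-quaternion entity with a unit relation quaternion via the Hamilton product and then rescaling its modulus. It is not the HRQE scoring function itself; the exact composition, normalisation, and scoring details follow the paper.

```python
# Rough NumPy sketch of quaternion rotation (Hamilton product) plus scaling.
import numpy as np

def hamilton(p, q):
    """Hamilton product of quaternions p = (a, b, c, d) and q = (e, f, g, h)."""
    a, b, c, d = p
    e, f, g, h = q
    return np.array([
        a*e - b*f - c*g - d*h,
        a*f + b*e + c*h - d*g,
        a*g - b*h + c*e + d*f,
        a*h + b*g - c*f + d*e,
    ])

def rotate_and_scale(head_xyz, rel_q, scale):
    """Rotate a pure quaternion (0, x, y, z) by unit quaternion rel_q, then scale."""
    rel_q = rel_q / np.linalg.norm(rel_q)             # keep the relation quaternion unit-norm
    h = np.concatenate(([0.0], head_xyz))             # pure quaternion for the head entity
    conj = rel_q * np.array([1.0, -1.0, -1.0, -1.0])  # quaternion conjugate
    rotated = hamilton(hamilton(rel_q, h), conj)      # q * h * q^-1 : a 3D rotation
    return scale * rotated[1:]                        # scaled imaginary (vector) part

head = np.array([1.0, 0.0, 0.0])
rel = np.array([np.cos(np.pi/4), 0.0, 0.0, np.sin(np.pi/4)])  # 90-degree rotation about z
print(np.round(rotate_and_scale(head, rel, scale=2.0), 3))    # ~[0. 2. 0.]
```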
Knowledge graph embedding aims to represent entities and relations as low-dimensional vectors, which is an effective way to predict missing links in knowledge graphs. Designing a strong and effective loss framework is essential for knowledge graph embedding models to distinguish correct triplets from incorrect ones. The classic margin-based ranking loss enforces a suitable margin between the scores of positive and negative triplets. The recently proposed Limit-based Scoring Loss independently limits the score ranges of positive and negative triplets. However, these loss frameworks apply equal or fixed penalty terms to positive and negative sample pairs, which makes optimization inflexible. Our intuition is that a triplet whose score deviates far from the optimum should be emphasized. To this end, we propose the Adaptive Limit Scoring Loss, which simply re-weights each triplet to highlight less-optimized triplet scores. We apply this loss framework to several knowledge graph embedding models such as TransE, TransH and ComplEx. Experimental results on link prediction and triplet classification show that our proposed method achieves performance on par with the state of the art.
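To make the re-weighting intuition concrete, the PyTorch sketch below starts from a limit-based loss (positive distances should fall below one limit, negative distances above another) and weights each triplet by how far it violates its limit, so poorly optimized triplets dominate the gradient. The softmax weighting, the limit values, and the function name are assumptions for illustration; the paper's exact formulation may differ.

```python
# Hedged sketch of an adaptive, re-weighted limit-based loss for KG embeddings.
import torch

def adaptive_limit_loss(pos_scores, neg_scores, gamma_pos=2.0, gamma_neg=4.0, temperature=1.0):
    """pos_scores / neg_scores: distance-style scores (lower = more plausible)."""
    pos_violation = torch.relu(pos_scores - gamma_pos)   # positives scoring too high
    neg_violation = torch.relu(gamma_neg - neg_scores)   # negatives scoring too low
    # Re-weight within the batch: larger violations receive larger (softmax) weights.
    pos_w = torch.softmax(pos_violation / temperature, dim=0).detach()
    neg_w = torch.softmax(neg_violation / temperature, dim=0).detach()
    return (pos_w * pos_violation).sum() + (neg_w * neg_violation).sum()

pos = torch.tensor([1.5, 3.0, 2.2])   # e.g. TransE distances for positive triplets
neg = torch.tensor([5.0, 3.1, 2.0])   # distances for corrupted (negative) triplets
print(adaptive_limit_loss(pos, neg))
```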
Finding a suitable embedding for a knowledge graph remains a major challenge. In previous knowledge graph embedding methods, every entity in a knowledge graph is usually represented as a k-dimensional vector. An affine transformation can be expressed as a matrix multiplication followed by a translation vector. In this paper, we first apply a set of relation-specific affine transformations to entity vectors, and then use the transformed vectors in existing embedding methods. The main advantage of affine transformations is their good geometric properties and interpretability. Our experimental results demonstrate that the proposed intuitive design with affine transformations provides a statistically significant increase in performance while adding only a few extra processing steps and a limited number of additional variables. Taking TransE as an example, we employ the scale transformation (a special case of an affine transformation) and introduce only k additional variables per relation. Surprisingly, it even outperforms RotatE to some extent on various datasets. We also introduce affine transformations into RotatE, DistMult and ComplEx, and each variant outperforms its original method.
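The sketch below shows one plausible way to add a relation-specific scale transformation (k extra parameters per relation, i.e. a diagonal affine map) in front of a TransE-style score. Whether the paper scales both head and tail, how it initializes the scales, and the class/parameter names here are assumptions; this is an illustration of the idea, not the authors' implementation.

```python
# Hedged PyTorch sketch: relation-specific scale transformation applied before
# a TransE-style translation score.
import torch
import torch.nn as nn

class ScaledTransE(nn.Module):
    def __init__(self, n_ent, n_rel, k=100):
        super().__init__()
        self.ent = nn.Embedding(n_ent, k)
        self.rel = nn.Embedding(n_rel, k)
        self.scale = nn.Embedding(n_rel, k)   # the k additional variables per relation
        nn.init.ones_(self.scale.weight)      # start out as plain TransE (identity scaling)
        nn.init.xavier_uniform_(self.ent.weight)
        nn.init.xavier_uniform_(self.rel.weight)

    def score(self, h, r, t):
        s = self.scale(r)
        h_e, t_e = s * self.ent(h), s * self.ent(t)   # relation-specific scale transformation
        return -torch.norm(h_e + self.rel(r) - t_e, p=1, dim=-1)  # TransE-style score

model = ScaledTransE(n_ent=50, n_rel=5)
print(model.score(torch.tensor([0, 1]), torch.tensor([2, 2]), torch.tensor([3, 4])))
```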