Thilini Wijesiriwardene
2023
ANALOGICAL - A Novel Benchmark for Long Text Analogy Evaluation in Large Language Models
Thilini Wijesiriwardene
|
Ruwan Wickramarachchi
|
Bimal Gajera
|
Shreeyash Gowaikar
|
Chandan Gupta
|
Aman Chadha
|
Aishwarya Naresh Reganti
|
Amit Sheth
|
Amitava Das
Findings of the Association for Computational Linguistics: ACL 2023
Over the past decade, analogies, in the form of word-level analogies, have played a significant role as an intrinsic measure of evaluating the quality of word embedding methods such as word2vec. Modern large language models (LLMs), however, are primarily evaluated on extrinsic measures based on benchmarks such as GLUE and SuperGLUE, and there are only a few investigations on whether LLMs can draw analogies between long texts. In this paper, we present ANALOGICAL, a new benchmark to intrinsically evaluate LLMs across a taxonomy of analogies of long text with six levels of complexity – (i) word, (ii) word vs. sentence, (iii) syntactic, (iv) negation, (v) entailment, and (vi) metaphor. Using thirteen datasets and three different distance measures, we evaluate the abilities of eight LLMs in identifying analogical pairs in the semantic vector space. Our evaluation finds that it is increasingly challenging for LLMs to identify analogies when going up the analogy taxonomy.
2022
Evaluating Biomedical Word Embeddings for Vocabulary Alignment at Scale in the UMLS Metathesaurus Using Siamese Networks
Goonmeet Bajaj
|
Vinh Nguyen
|
Thilini Wijesiriwardene
|
Hong Yung Yip
|
Vishesh Javangula
|
Amit Sheth
|
Srinivasan Parthasarathy
|
Olivier Bodenreider
Proceedings of the Third Workshop on Insights from Negative Results in NLP
Recent work uses a Siamese Network, initialized with BioWordVec embeddings (distributed word embeddings), for predicting synonymy among biomedical terms to automate a part of the UMLS (Unified Medical Language System) Metathesaurus construction process. We evaluate the use of contextualized word embeddings extracted from nine different biomedical BERT-based models for synonym prediction in the UMLS by replacing BioWordVec embeddings with embeddings extracted from each biomedical BERT model using different feature extraction methods. Finally, we conduct a thorough grid search, which prior work lacks, to find the best set of hyperparameters. Surprisingly, we find that Siamese Networks initialized with BioWordVec embeddings still out perform the Siamese Networks initialized with embedding extracted from biomedical BERT model.