Proceedings of the 2nd Workshop on Multi-lingual Representation Learning (MRL)

Duygu Ataman, Hila Gonen, Sebastian Ruder, Orhan Firat, Gözde Gül Sahin, Jamshidbek Mirzakhalov (Editors)

Anthology ID:
Abu Dhabi, United Arab Emirates (Hybrid)
Association for Computational Linguistics

Entity Retrieval from Multilingual Knowledge Graphs
Saher Esmeir | Arthur Câmara | Edgar Meij

Knowledge Graphs (KGs) are structured databases that capture real-world entities and their relationships. The task of entity retrieval from a KG aims at retrieving a ranked list of entities relevant to a given user query. While English-only entity retrieval has attracted considerable attention, user queries, as well as the information contained in the KG, may be represented in multiple, and possibly distinct, languages. Furthermore, KG content may vary between languages due to different information sources and points of view. Recent advances in language representation have enabled natural ways of bridging gaps between languages. In this paper, we therefore propose to utilise language models (LMs) and diverse entity representations to enable truly multilingual entity retrieval. We propose two approaches: (i) an array of monolingual retrievers and (ii) a single multilingual retriever, trained using queries and documents in multiple languages. We show that while our approach is on par with the significantly more complex state-of-the-art method for the English task, it can be successfully applied to virtually any language with an LM. Furthermore, it allows languages to benefit from one another, yielding significantly better performance for both low- and high-resource languages.

Few-Shot Cross-Lingual Learning for Event Detection
Luis Guzman Nateras | Viet Lai | Franck Dernoncourt | Thien Nguyen

Cross-Lingual Event Detection (CLED) models are capable of performing the Event Detection (ED) task in multiple languages. Such models are trained using data from a source language and then evaluated on data from a distinct target language. Training is usually performed in the standard supervised setting with labeled data available in the source language. The Few-Shot Learning (FSL) paradigm is yet to be explored for CLED despite its inherent advantage of allowing models to better generalize to unseen event types. As such, in this work, we study the CLED task under an FSL setting. Our contribution is threefold: first, we introduce a novel FSL classification method based on Optimal Transport (OT); second, we present a novel regularization term to incorporate the global distance between the support and query sets; and third, we adapt our approach to the cross-lingual setting by exploiting the alignment between source and target data. Our experiments on three syntactically different target languages show the applicability of our approach and its effectiveness at improving the cross-lingual performance of few-shot models for event detection.

Zero-shot Cross-Lingual Counterfactual Detection via Automatic Extraction and Prediction of Clue Phrases
Asahi Ushio | Danushka Bollegala

Counterfactual statements describe events that did not or cannot take place unless some conditions are satisfied. Existing counterfactual detection (CFD) methods assume the availability of manually labelled statements for each language they consider, limiting the broad applicability of CFD. In this paper, we consider the problem of zero-shot cross-lingual transfer learning for CFD. Specifically, we propose a novel loss function based on clue phrase prediction for generalising a CFD model trained on a source language to multiple target languages, without requiring any human-labelled data. We obtain clue phrases that express various language-specific lexical indicators of counterfactuality in the target language in an unsupervised manner using a neural alignment model. We evaluate our method on the Amazon Multilingual Counterfactual Dataset (AMCD) for English, German, and Japanese in the zero-shot cross-lingual transfer setup, where no manual annotations are used for the target language during training. The best CFD model fine-tuned on XLM-R improves the macro F1 score by 25% for German and 20% for Japanese as target languages compared to a model that is trained using only English source language data.

Zero-shot Cross-Language Transfer of Monolingual Entity Linking Models
Elliot Schumacher | James Mayfield | Mark Dredze

Most entity linking systems, whether mono- or multilingual, link mentions to a single English knowledge base. Few have considered linking non-English text to a non-English KB, and therefore transferring an English entity linking model to both a new document language and a new KB language. We consider the task of zero-shot cross-language transfer of entity linking systems to a new language and KB. We find that a system trained with multilingual representations does reasonably well, and propose improvements to system training that lead to improved recall in most datasets, often matching the in-language performance. We further conduct a detailed evaluation to elucidate the challenges of this setting.

Rule-Based Clause-Level Morphology for Multiple Languages
Tillmann Dönicke

This paper describes an approach for the morphosyntactic analysis of clauses, including the analysis of composite verb forms and both overt and covert pronouns. The approach uses grammatical rules for verb inflection and clause-internal word agreement to compute a clause’s morphosyntactic features from the morphological features of the individual words. The approach is tested for eight languages in the 1st Shared Task on Multilingual Clause-Level Morphology, where it achieves F1 scores between 79% and 99% (94% on average).

Comparative Analysis of Cross-lingual Contextualized Word Embeddings
Hossain Shaikh Saadi | Viktor Hangya | Tobias Eder | Alexander Fraser

Contextualized word embeddings have emerged as the most important tool for performing NLP tasks in a large variety of languages. In order to improve the cross-lingual representation and transfer learning quality, contextualized embedding alignment techniques, such as mapping and model fine-tuning, are employed. Existing techniques, however, are time-, data- and computational resource-intensive. In this paper we analyze these techniques by utilizing three tasks: bilingual lexicon induction (BLI), word retrieval and cross-lingual natural language inference (XNLI) for a high-resource (German-English) and a low-resource (Bengali-English) language pair. In contrast to previous works, which focus only on a few popular models, we compare five multilingual and seven monolingual language models and investigate the effect of various aspects on their performance, such as vocabulary size, number of languages used for training and number of parameters. Additionally, we propose a parameter-, data- and runtime-efficient technique which can be trained with 10% of the data and less than 10% of the time, and which has less than 5% of the trainable parameters compared to model fine-tuning. We show that our proposed method is competitive with resource-heavy models, even outperforming them in some cases, even though it relies on fewer resources.

How Language-Dependent is Emotion Detection? Evidence from Multilingual BERT
Luna De Bruyne | Pranaydeep Singh | Orphee De Clercq | Els Lefever | Veronique Hoste

As emotion analysis in text has gained a lot of attention in the field of natural language processing, differences in emotion expression across languages could have consequences for how emotion detection models work. We evaluate the language-dependence of an mBERT-based emotion detection model by comparing language identification performance before and after fine-tuning on emotion detection, and performing (adjusted) zero-shot experiments to assess whether emotion detection models rely on language-specific information. When dealing with typologically dissimilar languages, we found evidence for the language-dependence of emotion detection.

MicroBERT: Effective Training of Low-resource Monolingual BERTs through Parameter Reduction and Multitask Learning
Luke Gessler | Amir Zeldes

BERT-style contextualized word embedding models are critical for good performance in most NLP tasks, but they are data-hungry and therefore difficult to train for low-resource languages. In this work, we investigate whether a combination of greatly reduced model size and two linguistically rich auxiliary pretraining tasks (part-of-speech tagging and dependency parsing) can help produce better BERTs in a low-resource setting. Results from 7 diverse languages indicate that our model, MicroBERT, is able to produce marked improvements in downstream task evaluations, including gains up to 18% for parser LAS and 11% for NER F1 compared to an mBERT baseline, and we achieve these results with less than 1% of the parameter count of a multilingual BERT base–sized model. We conclude that training very small BERTs and leveraging any available labeled data for multitask learning during pretraining can produce models which outperform both their multilingual counterparts and traditional fixed embeddings for low-resource languages.

Transformers on Multilingual Clause-Level Morphology
Emre Can Acikgoz | Tilek Chubakov | Muge Kural | Gözde Şahin | Deniz Yuret

This paper describes the KUIS-AI NLP team’s submission for the 1st Shared Task on Multilingual Clause-level Morphology (MRL2022). We present our work on all three parts of the shared task: inflection, reinflection, and analysis. We mainly explore two approaches: Transformer models in combination with data augmentation, and exploiting state-of-the-art language modeling techniques for morphological analysis. Data augmentation leads to a remarkable performance improvement for most of the languages in the inflection task. Prefix-tuning on a pretrained mGPT model helps us adapt to the reinflection and analysis tasks in a low-data setting. Additionally, we used pipeline architectures using publicly available open-source lemmatization tools and monolingual BERT-based morphological feature classifiers for the reinflection and analysis tasks, respectively. While Transformer architectures with data augmentation and pipeline architectures achieved the best results for the inflection and reinflection tasks, pipelines and prefix-tuning on mGPT achieved the highest results for the analysis task. Our methods achieved first place in each of the three tasks and outperform the mT5 baseline with 89% for inflection, 80% for reinflection, and 12% for analysis. Our code is publicly available.

Impact of Sequence Length and Copying on Clause-Level Inflection
Badr Jaidi | Utkarsh Saboo | Xihan Wu | Garrett Nicolai | Miikka Silfverberg

We present the University of British Columbia’s submission to the MRL shared task on multilingual clause-level morphology. Our submission extends word-level inflectional models to the clause level in two ways: first, by evaluating the role that BPE plays in the learning of inflectional morphology, and second, by evaluating the importance of a copy bias obtained through data hallucination. Experiments demonstrate a strong preference for language-tuned BPE and a copy bias over a vanilla transformer. The methods are complementary for the inflection and analysis tasks: combined models see error reductions of 38% for inflection and 15.6% for analysis. However, this synergy does not hold for reinflection, which performs best under a BPE-only setting. A deeper analysis of the errors generated by our models illustrates that the copy bias may be too strong: the combined model produces predictions more similar to the copy-influenced system, despite the success of the BPE model.

Towards Improved Distantly Supervised Multilingual Named-Entity Recognition for Tweets
Ramy Eskander | Shubhanshu Mishra | Sneha Mehta | Sofia Samaniego | Aria Haghighi

Recent low-resource named-entity recognition (NER) work has shown impressive gains by leveraging a single multilingual model trained using distantly supervised data derived from cross-lingual knowledge bases. In this work, we investigate such approaches by leveraging Wikidata to build large-scale NER datasets of Tweets and propose two orthogonal improvements for low-resource NER in the Twitter social media domain: (1) leveraging domain-specific pre-training on Tweets; and (2) building a model for each language family rather than an all-in-one single multilingual model. For (1), we show that mBERT with Tweet pre-training outperforms the state-of-the-art multilingual transformer-based language model, LaBSE, by a relative increase of 34.6% in F1 when evaluated on Twitter data in a language-agnostic multilingual setting. For (2), we show that learning NER models for language families outperforms a single multilingual model by relative increases of 14.1%, 15.8% and 45.3% in F1 when utilizing mBERT, mBERT with Tweet pre-training and LaBSE, respectively. We conduct analyses and present examples for these observed improvements.
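
The abstract above reports its gains as relative increases in F1, which are easy to confuse with absolute differences in F1 points. A minimal Python sketch of the distinction follows. The function name and the input scores are hypothetical and chosen only to illustrate the arithmetic; they are not values from the paper.

```python
def relative_increase(baseline: float, improved: float) -> float:
    """Return the relative increase of `improved` over `baseline`, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (improved - baseline) / baseline * 100.0


# Hypothetical F1 scores (NOT taken from the paper), for illustration only.
baseline_f1 = 0.500
improved_f1 = 0.673

# Absolute gain in F1 points versus relative gain over the baseline.
absolute_gain_points = (improved_f1 - baseline_f1) * 100.0
relative_gain_percent = relative_increase(baseline_f1, improved_f1)

print(f"absolute gain: {absolute_gain_points:.1f} F1 points")
print(f"relative gain: {relative_gain_percent:.1f}%")
```

With these made-up inputs the absolute gain is about 17 F1 points while the relative gain is about 35%, so the two figures should not be read interchangeably.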

Average Is Not Enough: Caveats of Multilingual Evaluation
Matúš Pikuliak | Marian Simko

This position paper discusses the problem of multilingual evaluation. Using simple statistics, such as average language performance, might inject linguistic biases in favor of dominant language families into evaluation methodology. We argue that a qualitative analysis informed by comparative linguistics is needed for multilingual results to detect this kind of bias. We show in our case study that results in published works can indeed be linguistically biased, and we demonstrate that visualization based on the URIEL typological database can detect it.
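
To illustrate how a plain average over languages can be dominated by one language family, the following Python sketch compares it with an average taken over per-family means. This is only an illustration of the bias the abstract describes, not the paper's method, which relies on qualitative analysis and URIEL-based visualization. All scores are invented, and the function names are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Invented per-language scores (NOT from any paper), for illustration only.
scores = {"en": 0.90, "de": 0.88, "es": 0.89, "fr": 0.87, "sw": 0.60, "tr": 0.65}

# Language family of each language code used above.
family = {
    "en": "Indo-European",
    "de": "Indo-European",
    "es": "Indo-European",
    "fr": "Indo-European",
    "sw": "Niger-Congo",
    "tr": "Turkic",
}


def plain_average(scores: dict[str, float]) -> float:
    """Average over languages; families with many languages weigh more."""
    return mean(scores.values())


def family_balanced_average(scores: dict[str, float], family: dict[str, str]) -> float:
    """Average the per-family means, so that each family counts once."""
    by_family: dict[str, list[float]] = defaultdict(list)
    for lang, score in scores.items():
        by_family[family[lang]].append(score)
    return mean(mean(values) for values in by_family.values())


print(f"plain average:           {plain_average(scores):.3f}")
print(f"family-balanced average: {family_balanced_average(scores, family):.3f}")
```

In this toy setting the four Indo-European languages pull the plain average well above the family-balanced one. A balanced average is still a single summary statistic, so it complements rather than replaces the qualitative analysis the paper argues for.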

The MRL 2022 Shared Task on Multilingual Clause-level Morphology
Omer Goldman | Francesco Tinner | Hila Gonen | Benjamin Muller | Victoria Basmov | Shadrack Kirimi | Lydia Nishimwe | Benoît Sagot | Djamé Seddah | Reut Tsarfaty | Duygu Ataman

The 2022 Multilingual Representation Learning (MRL) Shared Task was dedicated to clause-level morphology. As the first ever benchmark that defines and evaluates morphology outside its traditional lexical boundaries, the shared task on multilingual clause-level morphology sets the scene for competition across different approaches to morphological modeling, with three clause-level sub-tasks: morphological inflection, reinflection and analysis, where systems are required to generate, manipulate or analyze simple sentences centered around a single content lexeme and a set of morphological features characterizing its syntactic clause. This year’s tasks covered eight typologically distinct languages: English, French, German, Hebrew, Russian, Spanish, Swahili and Turkish. The task received submissions of four systems from three teams, which were compared to two baselines implementing prominent multilingual learning methods. The results show that modern NLP models are effective in solving morphological tasks even at the clause level. However, there is still room for improvement, especially in the task of morphological analysis.