Milan Gritta


2021

Conversation Graph: Data Augmentation, Training, and Evaluation for Non-Deterministic Dialogue Management
Milan Gritta | Gerasimos Lampouras | Ignacio Iacobacci
Transactions of the Association for Computational Linguistics, Volume 9

Task-oriented dialogue systems typically rely on large amounts of high-quality training data or require complex handcrafted rules. However, existing datasets are often limited in size considering the complexity of the dialogues. Additionally, conventional training signal inference is not suitable for non-deterministic agent behavior, namely, considering multiple actions as valid in identical dialogue states. We propose the Conversation Graph (ConvGraph), a graph-based representation of dialogues that can be exploited for data augmentation, multi-reference training and evaluation of non-deterministic agents. ConvGraph generates novel dialogue paths to augment data volume and diversity. Intrinsic and extrinsic evaluation across three datasets shows that data augmentation and/or multi-reference training with ConvGraph can improve dialogue success rates by up to 6.4%.

Enhancing Transformers with Gradient Boosted Decision Trees for NLI Fine-Tuning
Benjamin Minixhofer | Milan Gritta | Ignacio Iacobacci
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

XeroAlign: Zero-shot cross-lingual transformer alignment
Milan Gritta | Ignacio Iacobacci
Findings of the Association for Computational Linguistics: ACL-IJCNLP 2021

2018

Which Melbourne? Augmenting Geocoding with Maps
Milan Gritta | Mohammad Taher Pilehvar | Nigel Collier
Proceedings of the 56th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

The purpose of text geolocation is to associate geographic information contained in a document with a set (or sets) of coordinates, either implicitly by using linguistic features and/or explicitly by using geographic metadata combined with heuristics. We introduce a geocoder (location mention disambiguator) that achieves state-of-the-art (SOTA) results on three diverse datasets by exploiting the implicit lexical clues. Moreover, we propose a new method for systematic encoding of geographic metadata to generate two distinct views of the same text. To that end, we introduce the Map Vector (MapVec), a sparse representation obtained by plotting prior geographic probabilities, derived from population figures, on a World Map. We then integrate the implicit (language) and explicit (map) features to significantly improve a range of metrics. We also introduce an open-source dataset for geoparsing of news events covering global disease outbreaks and epidemics to help future evaluation in geoparsing.

2017

Vancouver Welcomes You! Minimalist Location Metonymy Resolution
Milan Gritta | Mohammad Taher Pilehvar | Nut Limsopatham | Nigel Collier
Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Named entities are frequently used in a metonymic manner. They serve as references to related entities such as people and organisations. Accurate identification and interpretation of metonymy can be directly beneficial to various NLP applications, such as Named Entity Recognition and Geographical Parsing. Until now, metonymy resolution (MR) methods mainly relied on parsers, taggers, dictionaries, external word lists and other handcrafted lexical resources. We show how a minimalist neural approach combined with a novel predicate window method can achieve competitive results on the SemEval 2007 task on Metonymy Resolution. Additionally, we contribute a new Wikipedia-based MR dataset called RelocaR, which is tailored towards locations and addresses previous deficiencies in annotation guidelines.