Andrea Gesmundo


2014

Undirected Machine Translation with Discriminative Reinforcement Learning
Andrea Gesmundo | James Henderson
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics

Projecting the Knowledge Graph to Syntactic Parsing
Andrea Gesmundo | Keith Hall
Proceedings of the 14th Conference of the European Chapter of the Association for Computational Linguistics, volume 2: Short Papers

2012

Lemmatising Serbian as Category Tagging with Bidirectional Sequence Classification
Andrea Gesmundo | Tanja Samardžić
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

We present a novel tool for morphological analysis of Serbian, which is a low-resource language with rich morphology. Our tool produces lemmatisation and morphological analysis with accuracy considerably higher than that of the existing alternative tools: 83.6% relative error reduction on lemmatisation and 8.1% relative error reduction on morphological analysis. The system is trained on a small manually annotated corpus with an approach based on Bidirectional Sequence Classification and Guided Learning techniques, which have recently been adapted with success to a broad set of NLP tagging tasks. In the system presented in this paper, this general approach to tagging is applied to the lemmatisation task for the first time, thanks to our novel formulation of lemmatisation as a category tagging task. We show that learning lemmatisation rules from an annotated corpus and integrating context information into the process of morphological analysis provides state-of-the-art performance despite the lack of resources. The proposed system can be used via a web GUI that deploys its best-scoring configuration.

Machine Translation of Labeled Discourse Connectives
Thomas Meyer | Andrei Popescu-Belis | Najeh Hajlaoui | Andrea Gesmundo
Proceedings of the 10th Conference of the Association for Machine Translation in the Americas: Research Papers

This paper shows how the disambiguation of discourse connectives can improve their automatic translation, while preserving the overall performance of statistical MT as measured by BLEU. State-of-the-art automatic classifiers for rhetorical relations are used prior to MT to label discourse connectives that signal those relations. These labels are used for MT in two ways: (1) by augmenting factored translation models; and (2) by using the probability distributions of labels in order to train and tune SMT. The improvement of translation quality is demonstrated using a new semi-automated metric for discourse connectives, on the English/French WMT10 data, while BLEU scores remain comparable to non-discourse-aware systems, due to the low frequency of discourse connectives.

HadoopPerceptron: a Toolkit for Distributed Perceptron Training and Prediction with MapReduce
Andrea Gesmundo | Nadi Tomeh
Proceedings of the Demonstrations at the 13th Conference of the European Chapter of the Association for Computational Linguistics

Heuristic Cube Pruning in Linear Time
Andrea Gesmundo | Giorgio Satta | James Henderson
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

Lemmatisation as a Tagging Task
Andrea Gesmundo | Tanja Samardžić
Proceedings of the 50th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers)

2011

Classification de séquences bidirectionnelles pour des tâches d’étiquetage par apprentissage guidé (Bidirectional Sequence Classification for Tagging Tasks with Guided Learning)
Andrea Gesmundo
Actes de la 18e conférence sur le Traitement Automatique des Langues Naturelles. Articles courts

In this article we present a series of adaptations of the Guided Learning framework algorithm for solving different tagging tasks. The distinctive feature of the proposed system is its ability to learn the order of inference together with the parameters of the local classifier, instead of forcing inference into a predefined order (left to right). The training algorithm is based on the perceptron algorithm. We apply the system to different types of tagging tasks and reach state-of-the-art results with a short execution time.

Heuristic Search for Non-Bottom-Up Tree Structure Prediction
Andrea Gesmundo | James Henderson
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

Faster cube pruning
Andrea Gesmundo | James Henderson
Proceedings of the 7th International Workshop on Spoken Language Translation: Papers

2009

A Latent Variable Model of Synchronous Syntactic-Semantic Parsing for Multiple Languages
Andrea Gesmundo | James Henderson | Paola Merlo | Ivan Titov
Proceedings of the Thirteenth Conference on Computational Natural Language Learning (CoNLL 2009): Shared Task