Valentino Maiorca
2023
Accelerating Transformer Inference for Translation via Parallel Decoding
Andrea Santilli | Silvio Severino | Emilian Postolache | Valentino Maiorca | Michele Mancusi | Riccardo Marin | Emanuele Rodola
Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
Autoregressive decoding limits the efficiency of transformers for Machine Translation (MT). The community has proposed specific network architectures and learning-based methods to solve this issue, but these are expensive and require changes to the MT model, gaining inference speed at the cost of translation quality. In this paper, we propose to address the problem from the point of view of decoding algorithms, a less explored but rather compelling direction. We reframe the standard greedy autoregressive decoding of MT with a parallel formulation that leverages Jacobi and Gauss-Seidel fixed-point iteration methods for fast inference. This formulation makes it possible to speed up existing models without training or modifications while retaining translation quality. We present three parallel decoding algorithms and test them on different languages and models, showing that the parallelization introduces a speedup of up to 38% with respect to standard autoregressive decoding, and of nearly 2x when scaling the method on parallel resources. Finally, we introduce a decoding dependency graph visualizer (DDGviz) that lets us see how the model has learned the conditional dependence between tokens and inspect the decoding procedure.
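The Jacobi reformulation mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation and covers none of the three algorithms in detail: `toy_next_token` is a hypothetical deterministic stand-in for the argmax of a translation model, the function names are invented for illustration, and end-of-sequence handling is omitted. The idea shown is only that greedy decoding is the fixed point of the system y_i = f(y_1, ..., y_{i-1}), so all positions can be re-estimated at once from the previous guess until nothing changes.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical type: maps a prefix of token ids to the next (argmax) token id.
NextToken = Callable[[Sequence[int]], int]


def toy_next_token(prefix: Sequence[int]) -> int:
    """Toy deterministic stand-in for a model's greedy next-token choice."""
    return (sum(prefix) * 31 + len(prefix) * 7 + 3) % 50


def greedy_decode(next_token: NextToken, length: int) -> List[int]:
    """Standard autoregressive greedy decoding: one token per step."""
    out: List[int] = []
    for _ in range(length):
        out.append(next_token(out))
    return out


def jacobi_decode(
    next_token: NextToken, length: int, pad: int = 0
) -> Tuple[List[int], int]:
    """Jacobi fixed-point iteration over a block of `length` positions.

    Every position is recomputed from the previous iterate. With a real
    model, the inner list comprehension would be a single batched forward
    pass; here it is a plain loop (assumption made for illustration).
    Returns the decoded tokens and the number of iterations performed.
    """
    guess = [pad] * length
    for iteration in range(1, length + 1):
        new_guess = [next_token(guess[:i]) for i in range(length)]
        if new_guess == guess:  # fixed point reached: nothing changed
            return new_guess, iteration
        guess = new_guess
    # After `length` iterations, position i has been correct since
    # iteration i + 1, so the result equals the greedy output.
    return guess, length


if __name__ == "__main__":
    tokens, iterations = jacobi_decode(toy_next_token, 8)
    print(tokens == greedy_decode(toy_next_token, 8), iterations)
```

In this sketch the worst case needs as many iterations as there are positions, which matches sequential decoding; any speedup depends on the iteration reaching its fixed point earlier, which in turn depends on the model and on parallel hardware.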
2021
WikiNEuRal: Combined Neural and Knowledge-based Silver Data Creation for Multilingual NER
Simone Tedeschi | Valentino Maiorca | Niccolò Campolungo | Francesco Cecconi | Roberto Navigli
Findings of the Association for Computational Linguistics: EMNLP 2021
Multilingual Named Entity Recognition (NER) is a key intermediate task needed in many areas of NLP. In this paper, we address the well-known issue of data scarcity in NER, which is especially relevant when moving to a multilingual scenario, and go beyond current approaches to the creation of multilingual silver data for the task. We exploit the texts of Wikipedia and introduce a new methodology based on the effective combination of knowledge-based approaches and neural models, together with a novel domain adaptation technique, to produce high-quality training corpora for NER. We evaluate our datasets extensively on standard benchmarks for NER, yielding substantial improvements of up to 6 span-based F1-score points over previous state-of-the-art systems for data creation.