@inproceedings{libovicky-etal-2018-input,
    title = "Input Combination Strategies for Multi-Source Transformer Decoder",
    author = "Libovick{\'y}, Jind{\v{r}}ich  and
      Helcl, Jind{\v{r}}ich  and
      Mare{\v{c}}ek, David",
    editor = "Bojar, Ond{\v{r}}ej  and
      Chatterjee, Rajen  and
      Federmann, Christian  and
      Fishel, Mark  and
      Graham, Yvette  and
      Haddow, Barry  and
      Huck, Matthias  and
      Yepes, Antonio Jimeno  and
      Koehn, Philipp  and
      Monz, Christof  and
      Negri, Matteo  and
      N{\'e}v{\'e}ol, Aur{\'e}lie  and
      Neves, Mariana  and
      Post, Matt  and
      Specia, Lucia  and
      Turchi, Marco  and
      Verspoor, Karin",
    booktitle = "Proceedings of the Third Conference on Machine Translation: Research Papers",
    month = oct,
    year = "2018",
    address = "Brussels, Belgium",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/W18-6326/",
    doi = "10.18653/v1/W18-6326",
    pages = "253--260",
    abstract = "In multi-source sequence-to-sequence tasks, the attention mechanism can be modeled in several ways. This topic has been thoroughly studied on recurrent architectures. In this paper, we extend the previous work to the encoder-decoder attention in the Transformer architecture. We propose four different input combination strategies for the encoder-decoder attention: serial, parallel, flat, and hierarchical. We evaluate our methods on tasks of multimodal translation and translation with multiple source languages. The experiments show that the models are able to use multiple sources and improve over single source baselines."
}