Abstract
Phrases play a key role in Machine Translation (MT). In this paper, we apply a Long Short-Term Memory (LSTM) model over conventional Phrase-Based Statistical MT (PBSMT). The core idea is to use an LSTM encoder-decoder to score the phrase table generated by the PBSMT decoder. Given a source sequence, the encoder and decoder are jointly trained in order to maximize the conditional probability of a target sequence. Analytically, the performance of a PBSMT system is enhanced by using the conditional probabilities of phrase pairs computed by an LSTM encoder-decoder as an additional feature in the existing log-linear model. We compare the performance of the phrase tables in the PBSMT to the performance of the proposed LSTM and observe its positive impact on translation quality. We construct a PBSMT model using the Moses decoder and enrich the Language Model (LM) utilizing an external dataset. We then rank the phrase tables using an LSTM-based encoder-decoder. This method produces a gain of up to 3.14 BLEU score on the test set.- Anthology ID:
- R19-1004
- Volume:
- Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019)
- Month:
- September
- Year:
- 2019
- Address:
- Varna, Bulgaria
- Venue:
- RANLP
- SIG:
- Publisher:
- INCOMA Ltd.
- Note:
- Pages:
- 25–32
- Language:
- URL:
- https://aclanthology.org/R19-1004
- DOI:
- 10.26615/978-954-452-056-4_004
- Cite (ACL):
- Benyamin Ahmadnia and Bonnie Dorr. 2019. Enhancing Phrase-Based Statistical Machine Translation by Learning Phrase Representations Using Long Short-Term Memory Network. In Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP 2019), pages 25–32, Varna, Bulgaria. INCOMA Ltd..
- Cite (Informal):
- Enhancing Phrase-Based Statistical Machine Translation by Learning Phrase Representations Using Long Short-Term Memory Network (Ahmadnia & Dorr, RANLP 2019)
- PDF:
- https://preview.aclanthology.org/starsem-semeval-split/R19-1004.pdf