Semantic smoothing and fabrication of phrase pairs for SMT

Boxing Chen, Roland Kuhn, George Foster


Abstract
In statistical machine translation systems, phrases with similar meanings often have similar but not identical distributions of translations. This paper proposes a new soft clustering method to smooth the conditional translation probabilities for a given phrase with those of semantically similar phrases. We call this semantic smoothing (SS). Moreover, we fabricate new phrase pairs that were not observed in training data, but which may be used for decoding. In learning curve experiments against a strong baseline, we obtain a consistent pattern of modest improvement from semantic smoothing, and further modest improvement from phrase pair fabrication.
Anthology ID:
2011.iwslt-evaluation.19
Volume:
Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign
Month:
December 8-9
Year:
2011
Address:
San Francisco, California
Venue:
IWSLT
SIG:
SIGSLT
Pages:
144–150
URL:
https://aclanthology.org/2011.iwslt-evaluation.19
Cite (ACL):
Boxing Chen, Roland Kuhn, and George Foster. 2011. Semantic smoothing and fabrication of phrase pairs for SMT. In Proceedings of the 8th International Workshop on Spoken Language Translation: Evaluation Campaign, pages 144–150, San Francisco, California.
Cite (Informal):
Semantic smoothing and fabrication of phrase pairs for SMT (Chen et al., IWSLT 2011)
PDF:
https://preview.aclanthology.org/ingestion-script-update/2011.iwslt-evaluation.19.pdf