How good are your phrases? Assessing phrase quality with single class classification

Nadi Tomeh, Marco Turchi, Guillaume Wisinewski, Alexandre Allauzen, François Yvon


Abstract
We present a novel translation-quality-informed procedure for both extraction and scoring of phrase pairs in PBSMT systems. We reformulate the extraction problem in the supervised learning framework. Our goal is twofold: first, we attempt to take translation quality into account; second, we incorporate arbitrary features in order to circumvent alignment errors. One-Class SVMs and the Mapping Convergence algorithm permit training a single-class classifier to discriminate between useful and useless phrase pairs. Such a classifier can be learned from a training corpus that comprises only useful instances. The confidence score produced by the classifier for each phrase pair is employed as a selection criterion. The smoothness of these scores allows fine control over the size of the resulting translation model. Finally, confidence scores provide a new accuracy-based feature to score phrase pairs. Experimental evaluation of the method shows accurate assessments of phrase pair quality, even for regions in the space of possible phrase pairs that are ignored by other approaches. This enhanced evaluation of phrase pairs leads to improvements in translation performance as measured by BLEU.
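The selection step described in the abstract, where a confidence score acts as the selection criterion and its threshold controls the size of the translation model, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `select_phrase_pairs`, the tuple layout, and all scores below are hypothetical and made up for illustration, and the classifier that would produce the scores is not shown.

```python
from typing import List, Tuple

# Hypothetical record: (source phrase, target phrase, classifier confidence score).
ScoredPair = Tuple[str, str, float]


def select_phrase_pairs(scored: List[ScoredPair], threshold: float) -> List[ScoredPair]:
    """Keep the phrase pairs whose confidence score is at least `threshold`."""
    return [pair for pair in scored if pair[2] >= threshold]


# Made-up scores for illustration only; they do not come from the paper.
scored_pairs: List[ScoredPair] = [
    ("la maison", "the house", 0.92),
    ("la maison", "house", 0.41),
    ("maison bleue", "blue house", 0.77),
    ("la", "house", -0.35),
]

# Raising the threshold shrinks the resulting phrase table.
for t in (-1.0, 0.0, 0.5, 0.8):
    kept = select_phrase_pairs(scored_pairs, t)
    print(f"threshold={t:+.1f}: {len(kept)} phrase pairs kept")
```

Because the scores are continuous, any intermediate table size can be reached by moving the threshold, which is the kind of control over model size the abstract refers to.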
Anthology ID:
2011.iwslt-papers.10
Volume:
Proceedings of the 8th International Workshop on Spoken Language Translation: Papers
Month:
December 8-9
Year:
2011
Address:
San Francisco, California
Editors:
Marcello Federico, Mei-Yuh Hwang, Margit Rödder, Sebastian Stüker
Venue:
IWSLT
SIG:
SIGSLT
Pages:
261–268
URL:
https://aclanthology.org/2011.iwslt-papers.10
Cite (ACL):
Nadi Tomeh, Marco Turchi, Guillaume Wisinewski, Alexandre Allauzen, and François Yvon. 2011. How good are your phrases? Assessing phrase quality with single class classification. In Proceedings of the 8th International Workshop on Spoken Language Translation: Papers, pages 261–268, San Francisco, California.
Cite (Informal):
How good are your phrases? Assessing phrase quality with single class classification (Tomeh et al., IWSLT 2011)
PDF:
https://preview.aclanthology.org/emnlp-22-attachments/2011.iwslt-papers.10.pdf