Jessie Pinkham
We describe the implementation of two new language pairs (English-French and English-German) which use machine-learned sentence realization components instead of hand-written generation components. The resulting systems are evaluated by human evaluators and, in the technical domain, are equal in quality to highly respected commercial systems. We comment on the difficulties encountered when using machine-learned sentence realization in the context of MT.
MSR-MT is an advanced research MT prototype that combines rule-based and statistical techniques with example-based transfer. This hybrid, large-scale system is capable of learning all its knowledge of lexical and phrasal translations directly from data. MSR-MT has undergone rigorous evaluation showing that, trained on a corpus of technical data similar to the test corpus, its output surpasses the quality of best-of-breed commercial MT systems.
In this article we present MSR-MT, the French-English translation system developed at Microsoft in the natural language processing (NLP) research group. The system is based on sophisticated analyzers that produce logical forms in the source language and the target language. These logical forms are aligned to produce the transfer database, which contains the correspondences between source and target language that are used during translation. We present the different stages in the development of our system, which began in November 2000. We show that our system's October 2001 performance is better than that of the commercial Systran system in the technical domain, and we describe the linguistic work that allowed us to reach this performance. Finally, we present preliminary results on a more general corpus, the parliamentary debates of the Hansard corpus. Although our results are not as conclusive as for the technical domain, we are convinced that resolving the analysis problems we have identified will allow us to improve our performance.
Past research has shown that the ideal MT system should be modular and devoid of language-pair-specific information in its design. We describe here the assembly of TAMTAM (Traduction Automatique Microsoft), the French-English research MT system under development at Microsoft, which was constructed from a combination of pre-existing rule-based components and automatically created components. At this stage, the system has not been adapted either computationally or linguistically to the French-English context, and yet it performs only slightly below the French-English Systran system in independent blind human evaluations.
We describe MSR-MT, a large-scale example-based machine translation system under development for several language pairs. A blind evaluation shows that, when the system is trained on aligned English-Spanish technical prose, MSR-MT’s integration of rule-based parsers, example-based processing, and statistical techniques produces translations whose quality in this domain exceeds that of uncustomized commercial MT systems.