Martine Pettenaro


Using decision trees to learn lexical information in a linguistics-based NLP system
Marisa Jiménez | Martine Pettenaro
Actes de la 10ème conférence sur le Traitement Automatique des Langues Naturelles. Posters

This paper describes the use of decision trees to learn lexical information for the enrichment of our natural language processing (NLP) system. Our approach to lexical learning differs from other approaches in the field in that our machine learning techniques exploit a deep knowledge understanding system. After the introduction, we present the overall architecture of our lexical learning module. In the following sections, we present a showcase of lexical learning using decision trees: we learn verbs that take a human subject in Spanish and French.

High quality machine translation using a machine-learned sentence realization component
Martine Smets | Michael Gamon | Jessie Pinkham | Tom Reutter | Martine Pettenaro
Proceedings of Machine Translation Summit IX: Papers

We describe the implementation of two new language pairs (English-French and English-German) which use machine-learned sentence realization components instead of hand-written generation components. The resulting systems are evaluated by human evaluators and, in the technical domain, are equal in quality to highly respected commercial systems. We comment on the difficulties that are encountered when using machine-learned sentence realization in the context of MT.


Rapid assembly of a large-scale French-English MT system
Jessie Pinkham | Monica Corston-Oliver | Martine Smets | Martine Pettenaro
Proceedings of Machine Translation Summit VIII

Past research has shown that the ideal MT system should be modular and devoid of language-pair-specific information in its design. We describe here the assembly of TAMTAM (Traduction Automatique Microsoft), the French-English research MT system under development at Microsoft, which was constructed from a combination of pre-existing rule-based components and automatically created components. At this stage, the system has not been adapted either computationally or linguistically to the French-English context, and yet it performs only slightly below the French-English Systran system in independent blind human evaluations.