Lee Schwartz


Thai Sentence-Breaking for Large-Scale SMT
Glenn Slayden | Mei-Yuh Hwang | Lee Schwartz
Proceedings of the 1st Workshop on South and Southeast Asian Natural Language Processing


Impact of controlled language on translation quality and post-editing in a statistical machine translation environment
Takako Aikawa | Lee Schwartz | Ronit King | Mo Corston-Oliver | Carmen Lozano
Proceedings of Machine Translation Summit XI: Papers


Multilingual Corpus-based Approach to the Resolution of English –ing
Lee Schwartz | Takako Aikawa
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)


Disambiguation of English PP attachment using multilingual aligned data
Lee Schwartz | Takako Aikawa | Chris Quirk
Proceedings of Machine Translation Summit IX: Papers

Prepositional phrase attachment (PP attachment) is a major source of ambiguity in English. It poses a substantial challenge to Machine Translation (MT) between English and languages that are not characterized by PP attachment ambiguity. In this paper we present an unsupervised, bilingual, corpus-based approach to the resolution of English PP attachment ambiguity. As data we use aligned linguistic representations of the English and Japanese sentences from a large parallel corpus of technical texts. The premise of our approach is that with large aligned, parsed, bilingual (or multilingual) corpora, languages can learn non-trivial linguistic information from one another with high accuracy. We contend that our approach can be extended to linguistic phenomena other than PP attachment.


Combining Machine Learning and Rule-based Approaches in Spanish and Japanese Sentence Realization
Maite Melero | Takako Aikawa | Lee Schwartz
Proceedings of the International Natural Language Generation Conference


Generation for multilingual MT
Takako Aikawa | Maite Melero | Lee Schwartz | Andi Wu
Proceedings of Machine Translation Summit VIII

This paper presents an overview of the broad-coverage, application-independent natural language generation component of the NLP system being developed at Microsoft Research. It demonstrates how this component functions within a multilingual Machine Translation system (MSR-MT), using the languages that we are currently working on (English, Spanish, Japanese, and Chinese). Section 1 provides a system description of MSR-MT. Section 2 focuses on the generation component and its set of core rules. Section 3 describes an additional layer of generation rules with examples that address issues specific to MT. Section 4 presents evaluation results in the context of MSR-MT. Section 5 addresses generation issues outside of MT.

Multilingual Sentence Generation
Takako Aikawa | Maite Melero | Lee Schwartz | Andi Wu
Proceedings of the ACL 2001 Eighth European Workshop on Natural Language Generation (EWNLG)