Abstract
Prepositional phrase attachment (PP attachment) is a major source of ambiguity in English. It poses a substantial challenge to Machine Translation (MT) between English and languages that are not characterized by PP attachment ambiguity. In this paper we present an unsupervised, bilingual, corpus-based approach to the resolution of English PP attachment ambiguity. As data we use aligned linguistic representations of the English and Japanese sentences from a large parallel corpus of technical texts. The premise of our approach is that with large aligned, parsed, bilingual (or multilingual) corpora, languages can learn non-trivial linguistic information from one another with high accuracy. We contend that our approach can be extended to linguistic phenomena other than PP attachment.
- Anthology ID:
- 2003.mtsummit-papers.44
- Volume:
- Proceedings of Machine Translation Summit IX: Papers
- Month:
- September 23-27
- Year:
- 2003
- Address:
- New Orleans, USA
- Venue:
- MTSummit
- URL:
- https://aclanthology.org/2003.mtsummit-papers.44
- Cite (ACL):
- Lee Schwartz, Takako Aikawa, and Chris Quirk. 2003. Disambiguation of English PP attachment using multilingual aligned data. In Proceedings of Machine Translation Summit IX: Papers, New Orleans, USA.
- Cite (Informal):
- Disambiguation of English PP attachment using multilingual aligned data (Schwartz et al., MTSummit 2003)
- PDF:
- https://preview.aclanthology.org/improve-issue-templates/2003.mtsummit-papers.44.pdf