Assembling a Parallel Corpus from RSS News Feeds

John Fry


Abstract
We describe our use of RSS news feeds to quickly assemble a parallel English-Japanese corpus. Our method is simpler than other web mining approaches, and it produces a parallel corpus whose quality, quantity, and rate of growth are stable and predictable.
Anthology ID:
2005.mtsummit-ebmt.8
Volume:
Workshop on example-based machine translation
Month:
September 13-15
Year:
2005
Address:
Phuket, Thailand
Venue:
MTSummit
Pages:
59–62
URL:
https://aclanthology.org/2005.mtsummit-ebmt.8
Cite (ACL):
John Fry. 2005. Assembling a Parallel Corpus from RSS News Feeds. In Workshop on example-based machine translation, pages 59–62, Phuket, Thailand.
Cite (Informal):
Assembling a Parallel Corpus from RSS News Feeds (Fry, MTSummit 2005)
PDF:
https://preview.aclanthology.org/paclic-22-ingestion/2005.mtsummit-ebmt.8.pdf