Ralf D. Brown

Also published as: Ralf Brown


2014

Non-linear Mapping for Improved Identification of 1300+ Languages
Ralf Brown
Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP)

2011

Training Machine Translation with a Second-Order Taylor Approximation of Weighted Translation Instances
Aaron Phillips | Ralf Brown
Proceedings of Machine Translation Summit XIII: Papers

2010

Taming Structured Perceptrons on Wild Feature Vectors
Ralf Brown
Proceedings of the Joint Fifth Workshop on Statistical Machine Translation and MetricsMATR

Chunk-Based EBMT
Jae Dong Kim | Ralf Brown | Jaime Carbonell
Proceedings of the 14th Annual conference of the European Association for Machine Translation

Automatic Determination of Number of clusters for creating Templates in Example-Based Machine Translation
Rashmi Gangadharaiah | Ralf Brown | Jaime Carbonell
Proceedings of the 14th Annual conference of the European Association for Machine Translation

Monolingual Distributional Profiles for Word Substitution in Machine Translation
Rashmi Gangadharaiah | Ralf D. Brown | Jaime Carbonell
Coling 2010: Posters

2009

Active Learning in Example-Based Machine Translation
Rashmi Gangadharaiah | Ralf D. Brown | Jaime Carbonell
Proceedings of the 17th Nordic Conference of Computational Linguistics (NODALIDA 2009)

2008

Exploiting Document-Level Context for Data-Driven Machine Translation
Ralf Brown
Proceedings of the 8th Conference of the Association for Machine Translation in the Americas: Research Papers

This paper presents a method for exploiting document-level similarity between the documents in the training corpus for a corpus-driven (statistical or example-based) machine translation system and the input documents it must translate. The method is simple to implement, efficient (increases the translation time of an example-based system by only a few percent), and robust (still works even when the actual document boundaries in the input text are not known). Experiments on French-English and Arabic-English showed relative gains over the same system without using document-level similarity of up to 7.4% and 5.4%, respectively, on the BLEU metric.

2007

Improving example-based machine translation through morphological generalization and adaptation
Aaron B. Phillips | Violetta Cavalli-Sforza | Ralf D. Brown
Proceedings of Machine Translation Summit XI: Papers

2006

Spectral Clustering for Example Based Machine Translation
Rashmi Gangadharaiah | Ralf Brown | Jaime Carbonell
Proceedings of the Human Language Technology Conference of the NAACL, Companion Volume: Short Papers

2005

Symmetric probabilistic alignment for example-based translation
Jae Dong Kim | Ralf D. Brown | Peter J. Jansen | Jaime G. Carbonell
Proceedings of the 10th EAMT Conference: Practical applications of machine translation

Context-sensitive Retrieval for Example-based Translation
Ralf Brown
Workshop on example-based machine translation

Example-Based Machine Translation (EBMT) systems have typically operated on individual sentences without taking into account prior context. By adding a simple reweighting of retrieved fragments of training examples on the basis of whether the previous translation retrieved any fragments from examples within a small window of the current instance, translation performance is improved. A further improvement is seen by performing a similar reweighting when another fragment of the current input sentence was retrieved from the same training example. Together, a simple, straightforward implementation of these two factors results in an improvement on the order of 1.0–1.6% in the BLEU metric across multiple data sets in multiple languages.
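
The two reweighting factors described in this abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the paper: the `Fragment` record, the `reweight` function, the window size, and the two bonus multipliers are hypothetical, and training examples are assumed to be identified by their integer position in the corpus.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass
class Fragment:
    text: str         # retrieved translation fragment
    example_id: int   # position of the source training example in the corpus
    score: float      # base retrieval score


def reweight(fragments, previous_example_ids, window=2,
             context_bonus=2.0, same_example_bonus=1.5):
    """Boost fragments that are supported by context (hypothetical weights).

    context_bonus applies when the previous sentence drew on a training
    example within `window` positions of this fragment's example.
    same_example_bonus applies when another fragment retrieved for the
    current sentence comes from the same training example.
    """
    per_example = Counter(f.example_id for f in fragments)
    reweighted = []
    for f in fragments:
        weight = 1.0
        if any(abs(f.example_id - p) <= window for p in previous_example_ids):
            weight *= context_bonus
        if per_example[f.example_id] > 1:
            weight *= same_example_bonus
        reweighted.append(Fragment(f.text, f.example_id, f.score * weight))
    return reweighted


current = [Fragment("a", 10, 1.0), Fragment("b", 10, 1.0), Fragment("c", 50, 1.0)]
result = reweight(current, previous_example_ids={11})
```

In this toy input, the first two fragments receive both bonuses because they share training example 10, which lies within the window of example 11 used for the previous sentence; the third fragment keeps its base score.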

Symmetric Probabilistic Alignment
Ralf D. Brown | Jae Dong Kim | Peter J. Jansen | Jaime G. Carbonell
Proceedings of the ACL Workshop on Building and Using Parallel Texts

2004

A modified Burrows-Wheeler transform for highly scalable example-based translation
Ralf D. Brown
Proceedings of the 6th Conference of the Association for Machine Translation in the Americas: Technical Papers

The Burrows-Wheeler Transform (BWT) was originally developed for data compression, but can also be applied to indexing text. In this paper, an adaptation of the BWT to word-based indexing of the training corpus for an example-based machine translation (EBMT) system is presented. The adapted BWT embeds the necessary information to retrieve matched training instances without requiring any additional space and can be instantiated in a compressed form which reduces disk space and memory requirements by about 40% while still remaining searchable without decompression. Both the speed advantage from O(log N) lookups compared to the O(N) lookups in the inverted-file index which had previously been used and the structure of the index itself act as enablers for additional capabilities and run-time speed. Because the BWT groups all instances of any n-gram together, it can be used to quickly enumerate the most-frequent n-grams, for which translations can be precomputed and stored, resulting in an order-of-magnitude speedup at run time.
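
The logarithmic-time lookup and the grouping of identical n-grams mentioned in this abstract can be illustrated with a word-level suffix array, a closely related structure. This is a sketch of the general idea only, not the paper's modified BWT: the function names are hypothetical, the index is uncompressed, and the naive construction shown here is far slower than a production builder.

```python
def build_suffix_array(tokens):
    """Start positions of all suffixes, sorted by the words that follow.

    Naive construction for illustration; real indexes use faster algorithms.
    """
    return sorted(range(len(tokens)), key=lambda i: tokens[i:])


def _bound(tokens, suffix_array, ngram, upper):
    """Binary search for the first suffix whose n-word prefix is >= ngram
    (or > ngram when upper is True)."""
    lo, hi = 0, len(suffix_array)
    n = len(ngram)
    while lo < hi:
        mid = (lo + hi) // 2
        start = suffix_array[mid]
        prefix = tokens[start:start + n]
        if prefix < ngram or (upper and prefix == ngram):
            lo = mid + 1
        else:
            hi = mid
    return lo


def find_ngram(tokens, suffix_array, ngram):
    """Corpus positions where ngram occurs, using O(log N) comparisons."""
    first = _bound(tokens, suffix_array, ngram, upper=False)
    last = _bound(tokens, suffix_array, ngram, upper=True)
    # All occurrences sit in one contiguous block of the sorted index.
    return sorted(suffix_array[first:last])


corpus = "the cat sat on the mat and the cat slept".split()
index = build_suffix_array(corpus)
hits = find_ngram(corpus, index, ["the", "cat"])
```

Because every occurrence of an n-gram falls in one contiguous block of the sorted index, the block length gives the n-gram's frequency directly, which is the property that makes enumerating frequent n-grams cheap.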

Data Collection and Analysis of Mapudungun Morphology for Spelling Correction
Christian Monson | Lori Levin | Rodolfo Vega | Ralf Brown | Ariadna Font Llitjos | Alon Lavie | Jaime Carbonell | Eliseo Cañulef | Rosendo Huisca
Proceedings of the Fourth International Conference on Language Resources and Evaluation (LREC’04)

Challenges in using an example-based MT system for a transnational digital government project
Violetta Cavalli-Sforza | Ralf D. Brown | Jaime G. Carbonell | Peter G. Jansen | Jae Dong Kim
Proceedings of the 9th EAMT Workshop: Broadening horizons of machine translation and its applications

2003

Reducing boundary friction using translation-fragment overlap
Ralf D. Brown | Rebecca Hutchinson | Paul N. Bennett | Jaime G. Carbonell | Peter Jansen
Proceedings of Machine Translation Summit IX: Papers

Many corpus-based Machine Translation (MT) systems generate a number of partial translations which are then pieced together rather than immediately producing one overall translation. While this makes them more robust to ill-formed input, they are subject to disfluencies at phrasal translation boundaries even for well-formed input. We address this “boundary friction” problem by introducing a method that exploits overlapping phrasal translations and the increased confidence in translation accuracy they imply. We specify an efficient algorithm for producing translations using overlap. Finally, our empirical analysis indicates that this approach produces higher quality translations than the standard method of combining non-overlapping fragments generated by our Example-Based MT (EBMT) system in a peak-to-peak comparison.

2002

Corpus-driven splitting of compound words
Ralf Brown
Proceedings of the 9th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages: Papers

Example-based machine translation
Ralf Brown
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Tutorial Descriptions

Automatic rule learning for resource-limited MT
Jaime Carbonell | Katharina Probst | Erik Peterson | Christian Monson | Alon Lavie | Ralf Brown | Lori Levin
Proceedings of the 5th Conference of the Association for Machine Translation in the Americas: Technical Papers

Machine Translation of minority languages presents unique challenges, including the paucity of bilingual training data and the unavailability of linguistically-trained speakers. This paper focuses on a machine learning approach to transfer-based MT, where data in the form of translations and lexical alignments are elicited from bilingual speakers, and a seeded version-space learning algorithm formulates and refines transfer rules. A rule-generalization lattice is defined based on LFG-style f-structures, permitting generalization operators in the search for the most general rules consistent with the elicited data. The paper presents these methods and illustrates examples.

Using Similarity Scoring to Improve the Bilingual Dictionary for Sub-sentential Alignment
Katharina Probst | Ralf Brown
Proceedings of the 40th Annual Meeting of the Association for Computational Linguistics

Speech Translation on a Tight Budget without Enough Data
Robert E. Frederking | Alan W. Black | Ralf D. Brown | Alexander Rudnicky | John Moody | Eric Steinbrecher
Proceedings of the ACL-02 Workshop on Speech-to-Speech Translation: Algorithms and Systems

Field Testing the Tongues Speech-to-Speech Machine Translation System
Robert E. Frederking | Alan W. Black | Ralf D. Brown | John Moody | Eric Steinbrecher
Proceedings of the Third International Conference on Language Resources and Evaluation (LREC’02)

2001

Pre-processing of bilingual corpora for Mandarin-English EBMT
Ying Zhang | Ralf Brown | Robert Frederking | Alon Lavie
Proceedings of Machine Translation Summit VIII

Pre-processing of bilingual corpora plays an important role in Example-Based Machine Translation (EBMT) and Statistical-Based Machine Translation (SBMT). For our Mandarin-English EBMT system, pre-processing includes segmentation for Mandarin, bracketing for English, and building a statistical dictionary from the corpora. We used the Mandarin segmenter from the Linguistic Data Consortium (LDC), which uses dynamic programming with a frequency dictionary to segment the text. Although the frequency dictionary is large, it does not completely cover the corpora. In this paper, we describe the work we have done to improve the segmentation for Mandarin and the bracketing process for English to increase the length of English phrases. A statistical dictionary is built from the aligned bilingual corpus and used as feedback to segmentation and bracketing to re-segment and re-bracket the corpus. The process iterates several times to achieve better results. The final results of the corpus pre-processing are a segmented/bracketed aligned bilingual corpus and a statistical dictionary. The average length of Chinese terms increased by about 60% and the average length for English by about 10%, and the coverage of the statistical dictionary increased by about 30%.
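
Segmentation by dynamic programming over a frequency dictionary, as mentioned in this abstract, can be sketched as follows. This is a generic illustration and not the LDC segmenter: the scoring (sum of log relative frequencies), the fallback for unknown single characters, the `segment` function, and the toy dictionary are all assumptions.

```python
import math


def segment(text, freq, max_len=4):
    """Split text into the dictionary words with the highest total log-probability.

    freq maps known words to counts. Unknown single characters are allowed
    with a small fallback probability (an assumption of this sketch).
    """
    total = sum(freq.values())
    # best[i] = (score of the best segmentation of text[:i], start of its last word)
    best = [(0.0, 0)] + [(-math.inf, 0)] * len(text)
    for end in range(1, len(text) + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if word in freq:
                logp = math.log(freq[word] / total)
            elif end - start == 1:
                logp = math.log(0.5 / total)  # unknown single character
            else:
                continue
            candidate = best[start][0] + logp
            if candidate > best[end][0]:
                best[end] = (candidate, start)
    # Follow the back-pointers from the end of the text to recover the words.
    words = []
    i = len(text)
    while i > 0:
        start = best[i][1]
        words.append(text[start:i])
        i = start
    return words[::-1]


toy_freq = {"ab": 10, "a": 5, "b": 5, "cd": 8, "c": 1, "d": 1}
words = segment("abcd", toy_freq)
```

With this toy dictionary the longer entries win because one frequent two-character word scores higher than two separate single-character words, which mirrors why dictionary coverage matters for the length of the resulting terms.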

Transfer-rule induction for example-based translation
Ralf D. Brown
Workshop on Example-Based machine Translation

Design and implementation of controlled elicitation for machine translation of low-density languages
Katharina Probst | Ralf Brown | Jaime Carbonell | Alon Lavie | Lori Levin | Erik Peterson
Workshop on MT2010: Towards a Road Map for MT

NICE is a machine translation project for low-density languages. We are building a tool that will elicit a controlled corpus from a bilingual speaker who is not an expert in linguistics. The corpus is intended to cover major typological phenomena, as it is designed to work for any language. Using implicational universals, we strive to minimize the number of sentences that each informant has to translate. From the elicited sentences, we learn transfer rules with a version space algorithm. Our vision for MT in the future is one in which systems can be quickly trained for new languages by native speakers, so that speakers of minor languages can participate in education, health care, government, and the internet without having to give up their languages.

Adapting an Example-Based Translation System to Chinese
Ying Zhang | Ralf D. Brown | Robert E. Frederking
Proceedings of the First International Conference on Human Language Technology Research

A Server for Real-Time Event Tracking in News
Ralf D. Brown
Proceedings of the First International Conference on Human Language Technology Research

2000

Automated Generalization of Translation Examples
Ralf D. Brown
COLING 2000 Volume 1: The 18th International Conference on Computational Linguistics

1999

Adding linguistic knowledge to a lexical example-based translation system
Ralf D. Brown
Proceedings of the 8th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

1997

Automated dictionary extraction for “knowledge-free” example-based translation
Ralf D. Brown
Proceedings of the 7th Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

The DIPLOMAT Rapid Development Speech MT System
Robert E. Frederking | Ralf D. Brown | Christopher Hogan
Proceedings of Machine Translation Summit VI: Systems

1996

Example-Based Machine Translation in the Pangloss System
Ralf D. Brown
COLING 1996 Volume 1: The 16th International Conference on Computational Linguistics

The Pangloss-Lite machine translation system
Robert E. Frederking | Ralf D. Brown
Conference of the Association for Machine Translation in the Americas

1995

Applying Statistical English Language Modelling to Symbolic Machine Translation
Ralf Brown | Robert Frederking
Proceedings of the Sixth Conference on Theoretical and Methodological Issues in Machine Translation of Natural Languages

1994

Integrating Translations from Multiple Sources within the PANGLOSS Mark III Machine Translation System
Robert Frederking | Sergei Nirenburg | David Farwell | Steven Helmreich | Eduard Hovy | Kevin Knight | Stephen Beale | Constantino Domashnev | Donalee Attardo | Dean Grannes | Ralf Brown
Proceedings of the First Conference of the Association for Machine Translation in the Americas

1990

Human-Computer Interaction for Semantic Disambiguation
Ralf D. Brown
COLING 1990 Volume 3: Papers presented to the 13th International Conference on Computational Linguistics

1988

Anaphora Resolution: A Multi-Strategy Approach
Jaime G. Carbonell | Ralf D. Brown
Coling Budapest 1988 Volume 1: International Conference on Computational Linguistics