Michelle Vanni
Entity set expansion and synonym discovery are two critical NLP tasks. Previous studies accomplish them separately, without exploring their interdependencies. In this work, we hypothesize that these two tasks are tightly coupled because two synonymous entities tend to have a similar likelihood of belonging to various semantic classes. This motivates us to design SynSetExpan, a novel framework that enables the two tasks to mutually enhance each other. SynSetExpan uses a synonym discovery model to add popular entities’ infrequent synonyms to the set, which boosts set expansion recall. Meanwhile, the set expansion model, being able to determine whether an entity belongs to a semantic class, can generate pseudo training data to fine-tune the synonym discovery model towards better accuracy. To facilitate research on the interplay of these two tasks, we create the first large-scale Synonym-Enhanced Set Expansion (SE2) dataset via crowdsourcing. Extensive experiments on the SE2 dataset and previous benchmarks demonstrate the effectiveness of SynSetExpan for both entity set expansion and synonym discovery tasks.
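The mutual-enhancement loop this abstract describes can be pictured with a short sketch. Everything below is an assumption for illustration: the set_expander and syn_discoverer objects, their method names, the 0.5 threshold, and the fixed number of iterations are placeholders, not the authors' released implementation.

    # Hedged sketch of the iterative interplay: expand the set, pull in synonyms
    # of set members, then fine-tune synonym discovery on pseudo labels produced
    # by the set expansion model. All interfaces here are hypothetical.
    def syn_set_expan(seed_set, vocab, set_expander, syn_discoverer, n_iters=5):
        current_set = set(seed_set)
        for _ in range(n_iters):
            # 1. Set expansion: add entities the expansion model scores as class members.
            current_set.update(e for e in vocab
                               if e not in current_set
                               and set_expander.score(e, current_set) > 0.5)
            # 2. Synonym discovery: add infrequent synonyms of current members,
            #    boosting recall for entities that expansion alone would miss.
            for entity in list(current_set):
                current_set.update(syn_discoverer.synonyms_of(entity))
            # 3. Pseudo labels: treat the expansion model's class-membership
            #    decisions as weak supervision for the synonym model.
            pseudo_pairs = [((a, b), set_expander.same_class(a, b))
                            for a in current_set for b in vocab if a != b]
            syn_discoverer.fine_tune(pseudo_pairs)
        return current_set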
Recently, there has been an emphasis on creating shared resources for natural language processing applications. This has resulted in the development of high-quality tools and data, which can then be leveraged by the research community as components for novel systems. In this paper, we reuse an open source machine translation framework to create an Arabic-to-English entity translation system. The system first translates known entity mentions using a standard phrase-based statistical machine translation framework, which is then reused to perform name transliteration on unknown mentions. In order to transliterate names more accurately, we introduce an algorithm to augment a names database with name origin and frequency information from existing data resources. Origin information is used to learn name origin classifiers and origin-specific transliteration models, while frequency information is used to select amongst n-best transliteration candidates. This work demonstrates the feasibility and benefit of adapting such data resources and shows how off-the-shelf tools and data resources can be repurposed to rapidly create a system outside their original domain.
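As a rough illustration of the pipeline this abstract describes (translate known mentions with the phrase-based system, transliterate unknown ones with an origin-specific model, and rerank by name frequency), here is a minimal sketch. The names phrase_table, origin_classifier, translit_models, and name_freq are assumed stand-ins, not the system's actual components or API.

    def translate_entity(mention, phrase_table, origin_classifier,
                         translit_models, name_freq, n_best=10):
        # Known mention: reuse the phrase-based SMT translation directly.
        if mention in phrase_table:
            return phrase_table[mention]
        # Unknown mention: classify its origin and pick the transliteration
        # model trained for that origin.
        origin = origin_classifier.predict(mention)
        candidates = translit_models[origin].transliterate(mention, n_best=n_best)
        # Use name-frequency information to select among the n-best candidates.
        return max(candidates, key=lambda c: name_freq.get(c, 0))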
Tasks performed on machine translation (MT) output are associated with input text types such as genre and topic. Predictive Linguistic Assessments of Translation Output, or PLATO, MT Evaluation (MTE) explores a predictive relationship between linguistic metrics and the information processing tasks reliably performable on output. PLATO assigns a linguistic signature, which cuts across the task-based and automated metric paradigms. Here we report on PLATO assessments of clarity, coherence, morphology, syntax, lexical robustness, name-rendering, and terminology in a comparison of Arabic MT engines in which register differentiates the input. With a team of 10 assessors employing eight linguistic tests, we analyzed the results of five systems’ processing of 10 input texts from two distinct linguistic registers, 800 data sets in total. The analysis pointed to specific areas, such as general lexical robustness, where system performance was comparable on both types of input. Divergent performance, however, was observed on the clarity and name-rendering assessments. These results suggest that, while systems may be considered reliable regardless of input register for the lexicon-dependent triage task, register may have an effect on the suitability of MT systems’ output for relevance judgment and information extraction tasks, which rely on clarity and proper named-entity rendering. Further, we show that the evaluation metrics incorporated in PLATO differentiate between MT systems’ performance on a text type for which they are presumably optimized and one on which they are not.
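The kind of register comparison reported here can be illustrated with a small sketch that averages per-test scores by input register and flags (system, test) pairs whose scores diverge across registers. The data layout and the 0.2 threshold are assumptions for illustration only, not PLATO's actual scoring procedure.

    from collections import defaultdict

    def register_gaps(assessments, threshold=0.2):
        # assessments: iterable of (system, test, register, score),
        # with scores normalised to [0, 1].
        by_cell = defaultdict(list)
        for system, test, register, score in assessments:
            by_cell[(system, test, register)].append(score)
        means = {cell: sum(v) / len(v) for cell, v in by_cell.items()}

        # Group mean scores by (system, test) across the input registers.
        by_pair = defaultdict(dict)
        for (system, test, register), mean in means.items():
            by_pair[(system, test)][register] = mean

        # Report pairs whose best and worst register scores differ noticeably,
        # e.g. the clarity and name-rendering divergence mentioned above.
        return {pair: max(regs.values()) - min(regs.values())
                for pair, regs in by_pair.items()
                if max(regs.values()) - min(regs.values()) > threshold}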
The DARPA MT evaluations of the early 1990s (along with subsequent work on the MT Scale) and the International Standards for Language Engineering (ISLE) MT Evaluation framework represent two of the principal efforts in Machine Translation Evaluation (MTE) over the past decade. We describe a research program that builds on both of these efforts. This paper focuses on the selection of MT output features suggested in the ISLE framework, as well as the development of metrics for the features to be used in the study. We define each metric and describe the rationale for its development. We also discuss several of the finer points of the evaluation measures that arose during verification of the measures against sample output texts from three machine translation systems.
Work on comparing a set of linguistic test scores for MT output to a set of the same tests’ scores for naturally-occurring target language text (Jones and Rusk 2000) broke new ground in automating MT Evaluation. However, the tests used were selected on an ad hoc basis. In this paper, we report on work to extend our understanding, through refinement and validation, of suitable linguistic tests in the context of our novel approach to MTE. This approach was introduced in Miller and Vanni (2001a) and employs standard, rather than randomly-chosen, tests of MT output quality selected from the ISLE framework as well as a scoring system for predicting the type of information processing task performable with the output. Since the intent is to automate the scoring system, this work can also be viewed as the preliminary steps of algorithm design.
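The idea of scoring output for the information processing tasks it can support lends itself to a toy illustration: map per-test scores to the tasks whose minimum requirements they meet. The test names, thresholds, and task requirements below are invented for illustration and are not the scoring system described in the paper.

    # Hypothetical mapping from linguistic test scores (in [0, 1]) to tasks.
    TASK_REQUIREMENTS = {
        "triage": {"lexical_robustness": 0.5},
        "relevance_judgment": {"lexical_robustness": 0.6, "clarity": 0.6},
        "information_extraction": {"clarity": 0.7, "name_rendering": 0.7},
    }

    def predict_tasks(test_scores):
        # Return the tasks whose per-test cutoffs are all met by the output.
        return [task for task, reqs in TASK_REQUIREMENTS.items()
                if all(test_scores.get(test, 0.0) >= cutoff
                       for test, cutoff in reqs.items())]

    # Example: predict_tasks({"lexical_robustness": 0.8, "clarity": 0.65,
    #                         "name_rendering": 0.4}) -> ["triage", "relevance_judgment"]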
Machine Translation evaluation has been more magic and opinion than science. The history of MT evaluation is long and checkered; the search for objective, measurable, resource-reduced methods of evaluation continues. A recent trend towards task-based evaluation inspires the question: can we take methods of evaluating language competence in language learners and apply them reasonably to MT evaluation? This paper is the first in a series of steps to look at this question. In this paper, we present the theoretical framework for our ideas, the notions we ultimately aim towards, and some very preliminary results of a small experiment along these lines.