2016
Introducing the LCC Metaphor Datasets
Michael Mohler | Mary Brunson | Bryan Rink | Marc Tomlinson
Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC'16)
In this work, we present the Language Computer Corporation (LCC) annotated metaphor datasets, which represent the largest and most comprehensive resource for metaphor research to date. These datasets were produced over the course of three years by a staff of nine annotators working in four languages (English, Spanish, Russian, and Farsi). As part of these datasets, we provide (1) metaphoricity ratings for within-sentence word pairs on a four-point scale, (2) scored links to our repository of 114 source concept domains and 32 target concept domains, and (3) ratings for the affective polarity and intensity of each pair. Altogether, we provide 188,741 annotations in English (for 80,100 pairs), 159,915 annotations in Spanish (for 63,188 pairs), 99,740 annotations in Russian (for 44,632 pairs), and 137,186 annotations in Farsi (for 57,239 pairs). In addition, we are providing a large set of likely metaphors which have been independently extracted by our two state-of-the-art metaphor detection systems but which have not been analyzed by our team of annotators.
2015
A Corpus of Rich Metaphor Annotation
Jonathan Gordon | Jerry Hobbs | Jonathan May | Michael Mohler | Fabrizio Morbini | Bryan Rink | Marc Tomlinson | Suzanne Wertheim
Proceedings of the Third Workshop on Metaphor in NLP
2014
A Novel Distributional Approach to Multilingual Conceptual Metaphor Recognition
Michael Mohler | Bryan Rink | David Bracewell | Marc Tomlinson
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers
Semi-supervised methods for expanding psycholinguistics norms by integrating distributional similarity with the structure of WordNet
Michael Mohler | Marc Tomlinson | David Bracewell | Bryan Rink
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
In this work, we present two complementary methods for the expansion of psycholinguistics norms. The first method is a random-traversal spreading activation approach which transfers existing norms onto semantically related terms using notions of synonymy, hypernymy, and pertainymy to approach full coverage of the English language. The second method makes use of recent advances in distributional similarity representation to transfer existing norms to their closest neighbors in a high-dimensional vector space. These two methods (along with a naive hybrid approach combining the two) have been shown to significantly outperform a state-of-the-art resource expansion system at our pilot task of imageability expansion. We have evaluated these systems in a cross-validation experiment using 8,188 norms found in existing psycholinguistics literature. We have also validated the quality of these combined norms by performing a small study using Amazon Mechanical Turk (AMT).
2013
Semantic Signatures for Example-Based Linguistic Metaphor Detection
Michael Mohler | David Bracewell | Marc Tomlinson | David Hinote
Proceedings of the First Workshop on Metaphor in NLP
CPN-CORE: A Text Semantic Similarity System Infused with Opinion Knowledge
Carmen Banea | Yoonjung Choi | Lingjia Deng | Samer Hassan | Michael Mohler | Bishan Yang | Claire Cardie | Rada Mihalcea | Jan Wiebe
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 1: Proceedings of the Main Conference and the Shared Task: Semantic Textual Similarity
2012
UNT: A Supervised Synergistic Approach to Semantic Text Similarity
Carmen Banea | Samer Hassan | Michael Mohler | Rada Mihalcea
*SEM 2012: The First Joint Conference on Lexical and Computational Semantics – Volume 1: Proceedings of the main conference and the shared task, and Volume 2: Proceedings of the Sixth International Workshop on Semantic Evaluation (SemEval 2012)
2011
Learning to Grade Short Answer Questions using Semantic Similarity Measures and Dependency Graph Alignments
Michael Mohler | Razvan Bunescu | Rada Mihalcea
Proceedings of the 49th Annual Meeting of the Association for Computational Linguistics: Human Language Technologies
2009
Text-to-Text Semantic Similarity for Automatic Short Answer Grading
Michael Mohler | Rada Mihalcea
Proceedings of the 12th Conference of the European Chapter of the ACL (EACL 2009)
2008
Babylon Parallel Text Builder: Gathering Parallel Texts for Low-Density Languages
Michael Mohler | Rada Mihalcea
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)
This paper describes Babylon, a system that attempts to overcome the shortage of parallel texts in low-density languages by supplementing existing parallel texts with texts gathered automatically from the Web. In addition to the identification of entire Web pages, we also propose a new feature specifically designed to find parallel text chunks within a single document. Experiments carried out on the Quechua-Spanish language pair show that the system is successful in automatically identifying a significant amount of parallel texts on the Web. Evaluations of a machine translation system trained on this corpus indicate that the Web-gathered parallel texts can supplement manually compiled parallel texts and perform significantly better than the manually compiled texts when tested on other Web-gathered data.