Anatole Gershman


2015

Extending a Single-Document Summarizer to Multi-Document: a Hierarchical Approach
Luís Marujo | Ricardo Ribeiro | David Martins de Matos | João Neto | Anatole Gershman | Jaime Carbonell
Proceedings of the Fourth Joint Conference on Lexical and Computational Semantics

Matrix Factorization with Knowledge Graph Propagation for Unsupervised Spoken Language Understanding
Yun-Nung Chen | William Yang Wang | Anatole Gershman | Alexander Rudnicky
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)

Automatic Keyword Extraction on Twitter
Luís Marujo | Wang Ling | Isabel Trancoso | Chris Dyer | Alan W. Black | Anatole Gershman | David Martins de Matos | João Neto | Jaime Carbonell
Proceedings of the 53rd Annual Meeting of the Association for Computational Linguistics and the 7th International Joint Conference on Natural Language Processing (Volume 2: Short Papers)

2014

Metaphor Detection with Cross-Lingual Model Transfer
Yulia Tsvetkov | Leonid Boytsov | Anatole Gershman | Eric Nyberg | Chris Dyer
Proceedings of the 52nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)

Resources for the Detection of Conventionalized Metaphors in Four Languages
Lori Levin | Teruko Mitamura | Brian MacWhinney | Davida Fromm | Jaime Carbonell | Weston Feely | Robert Frederking | Anatole Gershman | Carlos Ramirez
Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)

This paper describes a suite of tools for extracting conventionalized metaphors in English, Spanish, Farsi, and Russian. The method depends on three significant resources for each language: a corpus of conventionalized metaphors, a table of conventionalized conceptual metaphors (CCM table), and a set of extraction rules. Conventionalized metaphors are expressions such as “escape from poverty” and “burden of taxation”. For each metaphor, the CCM table contains the metaphorical source domain word (such as “escape”), the target domain word (such as “poverty”), and the grammatical construction in which they can be found. The extraction rules operate on the output of a dependency parser and identify the grammatical configurations (such as a verb with a prepositional phrase complement) that are likely to contain conventional metaphors. We present results on detection rates for conventional metaphors and an analysis of the similarities and differences of source domains for conventional metaphors in the four languages.
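
As a rough illustration of how a CCM table lookup could be applied to parser output, the sketch below assumes that dependency edges arrive as (head lemma, construction label, dependent lemma) triples. The `CCM_TABLE` entries, the construction labels, and the function `find_conventional_metaphors` are hypothetical and are not taken from the tools the abstract describes.

```python
from typing import Iterable, List, Set, Tuple

# (source-domain word, target-domain word, grammatical construction)
CCMEntry = Tuple[str, str, str]
# (head lemma, construction label, dependent lemma), as a parser might emit
Edge = Tuple[str, str, str]

# Hypothetical table built only from the two examples quoted in the abstract.
CCM_TABLE: Set[CCMEntry] = {
    ("escape", "poverty", "verb+prep_from"),
    ("burden", "taxation", "noun+prep_of"),
}


def find_conventional_metaphors(
    edges: Iterable[Edge], table: Set[CCMEntry] = CCM_TABLE
) -> List[CCMEntry]:
    """Return the edges whose (head, dependent, construction) is in the table."""
    matches = []
    for head, construction, dependent in edges:
        candidate = (head, dependent, construction)
        if candidate in table:
            matches.append(candidate)
    return matches
```

A real extraction rule would first have to derive the construction label from the dependency tree; here it is assumed to be given.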

2013

Cross-Lingual Metaphor Detection Using Common Semantic Features
Yulia Tsvetkov | Elena Mukomel | Anatole Gershman
Proceedings of the First Workshop on Metaphor in NLP

2012

Supervised Topical Key Phrase Extraction of News Stories using Crowdsourcing, Light Filtering and Co-reference Normalization
Luís Marujo | Anatole Gershman | Jaime Carbonell | Robert Frederking | João P. Neto
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)

Fast and effective automated indexing is critical for search and personalized services. Key phrases, which consist of one or more words and represent the main concepts of a document, are often used for indexing. In this paper, we investigate the use of additional semantic features and pre-processing steps to improve automatic key phrase extraction. These features include the use of signal words and Freebase categories. Some of these features lead to significant improvements in the accuracy of the results. We also experimented with two forms of document pre-processing that we call light filtering and co-reference normalization. Light filtering removes sentences that are judged peripheral to the document's main content. Co-reference normalization unifies several written forms of the same named entity into a unique form. We also needed a “Gold Standard”: a set of labeled documents for training and evaluation. While the subjective nature of key phrase selection precludes a true “Gold Standard”, we used Amazon's Mechanical Turk service to obtain a useful approximation. Our data indicates that the biggest improvements in performance were due to shallow semantic features, news categories, and rhetorical signals (nDCG 78.47% vs. 68.93%). The inclusion of deeper semantic features such as Freebase sub-categories was not beneficial by itself, but in combination with pre-processing it did yield slight improvements in the nDCG scores.
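
For readers unfamiliar with the nDCG scores quoted above, the following is a minimal sketch of one common formulation (gain divided by log2 of rank + 1, normalized by the ideal ordering). The abstract does not say which nDCG variant or relevance scale the authors used, so the function names and the graded relevance values here are assumptions for illustration only.

```python
import math
from typing import Sequence


def dcg(relevances: Sequence[float]) -> float:
    """Discounted cumulative gain of a ranked list of relevance scores."""
    # enumerate() is 0-based, so rank + 2 gives log2(2) = 1 for the top item.
    return sum(rel / math.log2(rank + 2) for rank, rel in enumerate(relevances))


def ndcg(relevances: Sequence[float]) -> float:
    """DCG normalized by the DCG of the ideal (descending) ordering."""
    ideal = dcg(sorted(relevances, reverse=True))
    return dcg(relevances) / ideal if ideal > 0 else 0.0
```

Under this formulation, a ranking that lists the most relevant key phrases first scores 1.0, and the score drops as relevant phrases are pushed further down the list.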

Recognition of Named-Event Passages in News Articles
Luis Marujo | Wang Ling | Anatole Gershman | Jaime Carbonell | João P. Neto | David Matos
Proceedings of COLING 2012: Demonstration Papers

2010

CONE: Metrics for Automatic Evaluation of Named Entity Co-Reference Resolution
Bo Lin | Rushin Shah | Robert Frederking | Anatole Gershman
Proceedings of the 2010 Named Entities Workshop