Stan Szpakowicz

Also published as: Stanisław Szpakowicz, Stanislaw Szpakowicz

Book recommender systems can help promote the practice of reading for pleasure, which has been declining in recent years. One factor that influences reading preferences is writing style. We propose a system that recommends books after learning their authors’ style. To our knowledge, this is the first work that applies the information learned by an author-identification model to book recommendations. We evaluated the system according to a top-k recommendation scenario. Our system gives better accuracy when compared with many state-of-the-art methods. We also conducted a qualitative analysis by checking if similar books/authors were annotated similarly by experts.

pdf bib

Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Beatrice Alex | Stefania Degaetano-Ortlieb | Anna Feldman | Anna Kazantseva | Nils Reiter | Stan Szpakowicz
Proceedings of the Second Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

2017

pdf bib

Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature
Beatrice Alex | Stefania Degaetano-Ortlieb | Anna Feldman | Anna Kazantseva | Nils Reiter | Stan Szpakowicz
Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

pdf bib abs

Metaphor Detection in a Poetry Corpus
Vaibhav Kesarwani | Diana Inkpen | Stan Szpakowicz | Chris Tanasescu
Proceedings of the Joint SIGHUM Workshop on Computational Linguistics for Cultural Heritage, Social Sciences, Humanities and Literature

Metaphor is indispensable in poetry. It showcases the poet’s creativity, and contributes to the overall emotional pertinence of the poem while honing its specific rhetorical impact. Previous work on metaphor detection relies on either rule-based or statistical models, none of them applied to poetry. Our method focuses on metaphor detection in a poetry corpus. It combines rule-based and statistical models (word embeddings) to develop a new classification system. Our system has achieved a precision of 0.759 and a recall of 0.804 in identifying one type of metaphor in poetry.

2016

pdf bib abs

Adverbs in plWordNet: Theory and Implementation
Marek Maziarz | Stan Szpakowicz | Michal Kalinski
Proceedings of the 8th Global WordNet Conference (GWC)

Adverbs are seldom well represented in wordnets. Princeton WordNet, for example, derives from adjectives practically all its adverbs and whatever involvement they have. GermaNet stays away from this part of speech. Adverbs in plWordNet will be emphatically present in all their semantic and syntactic distinctness. We briefly discuss the linguistic background of the lexical system of Polish adverbs. We describe an automated generator of accurate candidate adverbs, and introduce the lexicographic procedures which will ensure high consistency of wordnet editors’ decisions about adverbs.

pdf bib abs

plWordNet 3.0 – Almost There
Maciej Piasecki | Stan Szpakowicz | Marek Maziarz | Ewa Rudnicka
Proceedings of the 8th Global WordNet Conference (GWC)

It took us nearly ten years to get from no wordnet for Polish to the largest wordnet ever built. We started small but quickly learned to dream big. Now we are about to release plWordNet 3.0-emo – complete with sentiment and emotions annotated – and a domestic version of Princeton WordNet, larger than WordNet 3.1 by nearly ten thousand newly added words. The paper retraces the road we travelled and talks a little about the future.

pdf bib abs

plWordNet 3.0 – a Comprehensive Lexical-Semantic Resource
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz | Paweł Kędzia
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers

We have released plWordNet 3.0, a very large wordnet for Polish. In addition to what is expected in wordnets – richly interrelated synsets – it contains sentiment and emotion annotations, a large set of multi-word expressions, and a mapping onto WordNet 3.1. Part of the release is enWordNet 1.0, a substantially enlarged copy of WordNet 3.1, with material added to allow for a more complete mapping. The paper discusses the design principles of plWordNet, its content, its statistical portrait, a comparison with similar resources, and a partial list of applications.

pdf bib

Proceedings of the Fifth Workshop on Computational Linguistics for Literature
Anna Feldman | Anna Kazantseva | Stan Szpakowicz
Proceedings of the Fifth Workshop on Computational Linguistics for Literature

2015

pdf bib

Literature Lifts Up Computational Linguistics
David K. Elson | Anna Feldman | Anna Kazantseva | Stan Szpakowicz
Linguistic Issues in Language Technology, Volume 12, 2015 - Literature Lifts up Computational Linguistics

bib abs

Learning Semantic Relations from Text
Preslav Nakov | Vivi Nastase | Diarmuid Ó Séaghdha | Stan Szpakowicz
Proceedings of the 2015 Conference on Empirical Methods in Natural Language Processing: Tutorial Abstracts

Every non-trivial text describes interactions and relations between people, institutions, activities, events and so on. What we know about the world consists in large part of such relations, and that knowledge contributes to the understanding of what texts refer to. Newly found relations can in turn become part of this knowledge that is stored for future use. To grasp a text’s semantic content, an automatic system must be able to recognize relations in texts and reason about them. This may be done by applying and updating previously acquired knowledge. We focus here in particular on semantic relations which describe the interactions among nouns and compact noun phrases, and we present such relations from both a theoretical and a practical perspective. The theoretical exploration sketches the historical path which has brought us to the contemporary view and interpretation of semantic relations. We discuss a wide range of relation inventories proposed by linguists and by language processing people. Such inventories vary by domain, granularity and suitability for downstream applications. On the practical side, we investigate the recognition and acquisition of relations from texts. In a look at supervised learning methods, we present available datasets, the variety of features which can describe relation instances, and learning algorithms found appropriate for the task. Next, we present weakly supervised and unsupervised learning methods of acquiring relations from large corpora with little or no previously annotated data. We show how enduring the bootstrapping algorithm based on seed examples or patterns has proved to be, and how it has been adapted to tackle Web-scale text collections. We also show a few machine learning techniques which can perform fast and reliable relation extraction by taking advantage of data redundancy and variability.

pdf bib

A Procedural Definition of Multi-word Lexical Units
Marek Maziarz | Stan Szpakowicz | Maciej Piasecki
Proceedings of the International Conference Recent Advances in Natural Language Processing

pdf bib

A Large Wordnet-based Sentiment Lexicon for Polish
Monika Zaśko-Zielińska | Maciej Piasecki | Stan Szpakowicz
Proceedings of the International Conference Recent Advances in Natural Language Processing

pdf bib

Proceedings of the Fourth Workshop on Computational Linguistics for Literature
Anna Feldman | Anna Kazantseva | Stan Szpakowicz | Corina Koolen
Proceedings of the Fourth Workshop on Computational Linguistics for Literature

2014

pdf bib

Hierarchical Topical Segmentation with Affinity Propagation
Anna Kazantseva | Stan Szpakowicz
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib

Measuring Lexical Cohesion: Beyond Word Repetition
Anna Kazantseva | Stan Szpakowicz
Proceedings of COLING 2014, the 25th International Conference on Computational Linguistics: Technical Papers

pdf bib

Terminology in WordNet and in plWordNet
Marta Dobrowolska | Stan Szpakowicz
Proceedings of the Seventh Global Wordnet Conference

pdf bib

plWordNet as the Cornerstone of a Toolkit of Lexico-semantic Resources
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz
Proceedings of the Seventh Global Wordnet Conference

pdf bib

Registers in the System of Semantic Relations in plWordNet
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz
Proceedings of the Seventh Global Wordnet Conference

pdf bib

Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLFL)
Anna Feldman | Anna Kazantseva | Stan Szpakowicz
Proceedings of the 3rd Workshop on Computational Linguistics for Literature (CLFL)

2013

pdf bib

Beyond the Transfer-and-Merge Wordnet Construction: plWordNet and a Comparison with WordNet
Marek Maziarz | Maciej Piasecki | Ewa Rudnicka | Stan Szpakowicz
Proceedings of the International Conference Recent Advances in Natural Language Processing RANLP 2013

pdf bib

SemEval-2013 Task 4: Free Paraphrases of Noun Compounds
Iris Hendrickx | Zornitsa Kozareva | Preslav Nakov | Diarmuid Ó Séaghdha | Stan Szpakowicz | Tony Veale
Second Joint Conference on Lexical and Computational Semantics (*SEM), Volume 2: Proceedings of the Seventh International Workshop on Semantic Evaluation (SemEval 2013)

pdf bib

Proceedings of the Workshop on Computational Linguistics for Literature
David Elson | Anna Kazantseva | Stan Szpakowicz
Proceedings of the Workshop on Computational Linguistics for Literature

2012

pdf bib

A Strategy of Mapping Polish WordNet onto Princeton WordNet
Ewa Rudnicka | Marek Maziarz | Maciej Piasecki | Stan Szpakowicz
Proceedings of COLING 2012: Posters

pdf bib

Topical Segmentation: a Study of Human Performance and a New Measure of Quality.
Anna Kazantseva | Stan Szpakowicz
Proceedings of the 2012 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies

pdf bib

Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature
David Elson | Anna Kazantseva | Rada Mihalcea | Stan Szpakowicz
Proceedings of the NAACL-HLT 2012 Workshop on Computational Linguistics for Literature

pdf bib

Prior versus Contextual Emotion of a Word in a Sentence
Diman Ghazi | Diana Inkpen | Stan Szpakowicz
Proceedings of the 3rd Workshop in Computational Approaches to Subjectivity and Sentiment Analysis

2011

pdf bib

Linear Text Segmentation Using Affinity Propagation
Anna Kazantseva | Stan Szpakowicz
Proceedings of the 2011 Conference on Empirical Methods in Natural Language Processing

2010

pdf bib

Summarizing Short Stories
Anna Kazantseva | Stan Szpakowicz
Computational Linguistics, Volume 36, Number 1, March 2010

pdf bib

Last Words: Failure is an Orphan (Let’s Adopt)
Stan Szpakowicz
Computational Linguistics, Volume 36, Number 1, March 2010

pdf bib

SemEval-2 Task 9: The Interpretation of Noun Compounds Using Paraphrasing Verbs and Prepositions
Cristina Butnariu | Su Nam Kim | Preslav Nakov | Diarmuid Ó Séaghdha | Stan Szpakowicz | Tony Veale
Proceedings of the 5th International Workshop on Semantic Evaluation

pdf bib

Hierarchical versus Flat Classification of Emotions in Text
Diman Ghazi | Diana Inkpen | Stan Szpakowicz
Proceedings of the NAACL HLT 2010 Workshop on Computational Approaches to Analysis and Generation of Emotion in Text

2009

pdf bib

SemEval-2010 Task 9: The Interpretation of Noun Compounds Using Paraphrasing Verbs and Prepositions
Cristina Butnariu | Su Nam Kim | Preslav Nakov | Diarmuid Ó Séaghdha | Stan Szpakowicz | Tony Veale
Proceedings of the Workshop on Semantic Evaluations: Recent Achievements and Future Directions (SEW-2009)

2008

pdf bib

The Telling Tail: Signals of Success in Electronic Negotiation Texts
Marina Sokolova | Vivi Nastase | Stan Szpakowicz
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib

Using Roget’s Thesaurus for Fine-grained Emotion Recognition
Saima Aman | Stan Szpakowicz
Proceedings of the Third International Joint Conference on Natural Language Processing: Volume-I

pdf bib abs

Corpus-based Semantic Relatedness for the Construction of Polish WordNet
Bartosz Broda | Magdalena Derwojedowa | Maciej Piasecki | Stanislaw Szpakowicz
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

The construction of a wordnet, a labour-intensive enterprise, can be significantly assisted by automatic grouping of lexical material and discovery of lexical semantic relations. The objective is to ensure high quality of automatically acquired results before they are presented for lexicographers’ approval. We discuss a software tool that suggests synset members using a measure of semantic relatedness with a given verb or adjective; this extends previous work on nominal synsets in Polish WordNet. Syntactically-motivated constraints are deployed on a large morphologically annotated corpus of Polish. Evaluation has been performed via the WordNet-Based Similarity Test and additionally supported by human raters. A lexicographer also manually assessed a suitable sample of suggestions. The results compare favourably with other known methods of acquiring semantic relations.

pdf bib abs

Using the Web as a Linguistic Resource to Automatically Correct Lexico-Syntactic Errors
Matthieu Hermet | Alain Désilets | Stan Szpakowicz
Proceedings of the Sixth International Conference on Language Resources and Evaluation (LREC'08)

This paper presents an algorithm for correcting language errors typical of second-language learners. We focus on preposition errors, which are very common among second-language learners but are not addressed well by current commercial grammar correctors and editing aids. The algorithm takes as input a sentence containing a preposition error (and possibly other errors as well), and outputs the correct preposition for that particular sentence context. We use a two-phase hybrid rule-based and statistical approach. In the first phase, rule-based processing is used to generate a short expression that captures the context of use of the preposition in the input sentence. In the second phase, Web searches are used to evaluate the frequency of this expression, when alternative prepositions are used instead of the original one. We tested this algorithm on a corpus of 133 French sentences written by intermediate second-language learners, and found that it could address 69.9% of those cases. In contrast, we found that the best French grammar and spell checker currently on the market, Antidote, addressed only 3% of those cases. We also showed that performance degrades gracefully when using a corpus of frequent n-grams to evaluate frequencies.

pdf bib

Evaluating Roget’s Thesauri
Alistair Kennedy | Stan Szpakowicz
Proceedings of ACL-08: HLT