Abstract
Paraphrases extracted from parallel corpora by the pivot method (Bannard and Callison-Burch, 2005) constitute a valuable resource for multilingual NLP applications. In this study, we analyse the semantics of unigram pivot paraphrases and use a graph-based sense induction approach to unveil hidden sense distinctions in the paraphrase sets. The comparison of the acquired senses to gold data from the Lexical Substitution shared task (McCarthy and Navigli, 2007) demonstrates that sense distinctions exist in the paraphrase sets and highlights the need for a disambiguation step in applications using this resource.- Anthology ID:
- L14-1401
- Volume:
- Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14)
- Month:
- May
- Year:
- 2014
- Address:
- Reykjavik, Iceland
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Hrafn Loftsson, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- SIG:
- Publisher:
- European Language Resources Association (ELRA)
- Note:
- Pages:
- 4270–4275
- Language:
- URL:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/475_Paper.pdf
- DOI:
- Cite (ACL):
- Marianna Apidianaki, Emilia Verzeni, and Diana McCarthy. 2014. Semantic Clustering of Pivot Paraphrases. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14), pages 4270–4275, Reykjavik, Iceland. European Language Resources Association (ELRA).
- Cite (Informal):
- Semantic Clustering of Pivot Paraphrases (Apidianaki et al., LREC 2014)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2014/pdf/475_Paper.pdf