Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource

Hideki Shima, Teruko Mitamura


Abstract
Recognizing similar or closely related meanings expressed in different surface forms is a common challenge in many Natural Language Processing and Information Access applications. However, we identified multiple limitations in the existing resources that can be used to address this vocabulary mismatch problem. To overcome them, we propose the Diversifiable Bootstrapping algorithm, which learns paraphrase patterns with high lexical coverage. The algorithm works in a lightly supervised, iterative fashion in which instance acquisition and pattern acquisition are interleaved, each using information provided by the other. By adjusting a parameter of the algorithm, one can control the degree to which the resulting patterns are diversified.
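The interleaved loop described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy corpus, the function names (`jaccard`, `bootstrap`), and above all the scoring rule are assumptions made for illustration. The score used here is a generic relevance-versus-redundancy trade-off, where the hypothetical `diversity` parameter penalizes candidate patterns that share many tokens with patterns already selected.

```python
# Toy corpus of (x, pattern, y) triples. All data here is illustrative.
CORPUS = [
    ("Oswald", "X killed Y", "Kennedy"),
    ("Booth", "X killed Y", "Lincoln"),
    ("Brutus", "X killed Y", "Caesar"),
    ("Ruby", "X killed Y", "Oswald"),
    ("Oswald", "X has killed Y", "Kennedy"),
    ("Booth", "X has killed Y", "Lincoln"),
    ("Oswald", "X assassinated Y", "Kennedy"),
    ("Princip", "X assassinated Y", "Ferdinand"),
]
SEEDS = {("Oswald", "Kennedy"), ("Booth", "Lincoln"), ("Brutus", "Caesar")}


def jaccard(a, b):
    """Token-level Jaccard similarity between two pattern strings."""
    ta, tb = set(a.split()), set(b.split())
    return len(ta & tb) / len(ta | tb)


def bootstrap(corpus, seeds, rounds=1, per_round=2, diversity=0.0):
    """Alternate pattern and instance acquisition (assumed scoring, see lead-in)."""
    instances = set(seeds)
    patterns = []
    for _ in range(rounds):
        # Pattern step: collect the known instances that support each pattern.
        support = {}
        for x, p, y in corpus:
            if (x, y) in instances:
                support.setdefault(p, set()).add((x, y))
        for _ in range(per_round):
            best, best_score = None, None
            for p in sorted(support):
                if p in patterns:
                    continue
                # Relevance: fraction of known instances the pattern matches.
                relevance = len(support[p]) / len(instances)
                # Redundancy: highest overlap with an already selected pattern.
                redundancy = max((jaccard(p, q) for q in patterns), default=0.0)
                score = (1 - diversity) * relevance - diversity * redundancy
                if best_score is None or score > best_score:
                    best, best_score = p, score
            if best is None:
                break
            patterns.append(best)
        # Instance step: every pair matched by a selected pattern becomes known.
        for x, p, y in corpus:
            if p in patterns:
                instances.add((x, y))
    return patterns, instances


if __name__ == "__main__":
    for d in (0.0, 0.8):
        selected, found = bootstrap(CORPUS, SEEDS, diversity=d)
        print(d, selected, sorted(found))
```

On this toy data, raising `diversity` from 0.0 to 0.8 changes the second selected pattern from "X has killed Y" to the lexically more distinct "X assassinated Y", which in turn lets the instance step pick up a pair that the near-duplicate pattern would not have reached.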
Anthology ID:
L12-1557
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Editors:
Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
2666–2673
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/934_Paper.pdf
Cite (ACL):
Hideki Shima and Teruko Mitamura. 2012. Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 2666–2673, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource (Shima & Mitamura, LREC 2012)