Abstract
Recognizing similar or closely related meanings expressed in different surface forms is a common challenge in various Natural Language Processing and Information Access applications. However, we identified multiple limitations in existing resources that can be used for solving the vocabulary mismatch problem. To address these limitations, we propose the Diversifiable Bootstrapping algorithm, which can learn paraphrase patterns with high lexical coverage. The algorithm works in a lightly supervised, iterative fashion in which instance and pattern acquisition are interleaved, each using information provided by the other. By adjusting a parameter of the algorithm, one can control the degree to which the resulting patterns are diversified.
- Anthology ID:
- L12-1557
- Volume:
- Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
- Month:
- May
- Year:
- 2012
- Address:
- Istanbul, Turkey
- Editors:
- Nicoletta Calzolari, Khalid Choukri, Thierry Declerck, Mehmet Uğur Doğan, Bente Maegaard, Joseph Mariani, Asuncion Moreno, Jan Odijk, Stelios Piperidis
- Venue:
- LREC
- Publisher:
- European Language Resources Association (ELRA)
- Pages:
- 2666–2673
- URL:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/934_Paper.pdf
- Cite (ACL):
- Hideki Shima and Teruko Mitamura. 2012. Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 2666–2673, Istanbul, Turkey. European Language Resources Association (ELRA).
- Cite (Informal):
- Diversifiable Bootstrapping for Acquiring High-Coverage Paraphrase Resource (Shima & Mitamura, LREC 2012)
- PDF:
- http://www.lrec-conf.org/proceedings/lrec2012/pdf/934_Paper.pdf