Monolingual Phrase Alignment on Parse Forests

Yuki Arase, Junichi Tsujii


Abstract
We propose an efficient method to conduct phrase alignment on parse forests for paraphrase detection. Unlike previous studies, our method identifies syntactic paraphrases under linguistically motivated grammar. In addition, it allows phrases to non-compositionally align to handle paraphrases with non-homographic phrase correspondences. A dataset that provides gold parse trees and their phrase alignments is created. The experimental results confirm that the proposed method conducts highly accurate phrase alignment compared to human performance.
Anthology ID:
D17-1001
Volume:
Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing
Month:
September
Year:
2017
Address:
Copenhagen, Denmark
Venue:
EMNLP
SIG:
SIGDAT
Publisher:
Association for Computational Linguistics
Pages:
1–11
URL:
https://aclanthology.org/D17-1001
DOI:
10.18653/v1/D17-1001
Cite (ACL):
Yuki Arase and Junichi Tsujii. 2017. Monolingual Phrase Alignment on Parse Forests. In Proceedings of the 2017 Conference on Empirical Methods in Natural Language Processing, pages 1–11, Copenhagen, Denmark. Association for Computational Linguistics.
Cite (Informal):
Monolingual Phrase Alignment on Parse Forests (Arase & Tsujii, EMNLP 2017)
PDF:
https://preview.aclanthology.org/ingestion-script-update/D17-1001.pdf
Attachment:
 D17-1001.Attachment.zip
Video:
 https://vimeo.com/238234373