Abstract
Paraphrase identification involves determining whether a pair of sentences expresses the same or similar meaning. While cross-encoders have achieved high performance across several benchmarks, bi-encoders such as SBERT have been widely applied to sentence pair tasks. They exhibit substantially lower computational complexity and are better suited to symmetric tasks. In this work, we adopt a bi-encoder approach to the paraphrase identification task and investigate the impact of explicitly incorporating predicate-argument information into SBERT through weighted aggregation. Experiments on six paraphrase identification datasets demonstrate that, with a minimal increase in parameters, the proposed model is able to outperform SBERT/SRoBERTa significantly. Further, ablation studies reveal that the predicate-argument based component plays a significant role in the performance gain.
- Anthology ID:
- 2022.acl-long.382
- Volume:
- Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)
- Month:
- May
- Year:
- 2022
- Address:
- Dublin, Ireland
- Venue:
- ACL
- Publisher:
- Association for Computational Linguistics
- Pages:
- 5579–5589
- URL:
- https://aclanthology.org/2022.acl-long.382
- DOI:
- 10.18653/v1/2022.acl-long.382
- Cite (ACL):
- Qiwei Peng, David Weir, Julie Weeds, and Yekun Chai. 2022. Predicate-Argument Based Bi-Encoder for Paraphrase Identification. In Proceedings of the 60th Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pages 5579–5589, Dublin, Ireland. Association for Computational Linguistics.
- Cite (Informal):
- Predicate-Argument Based Bi-Encoder for Paraphrase Identification (Peng et al., ACL 2022)
- PDF:
- https://preview.aclanthology.org/ingestion-script-update/2022.acl-long.382.pdf
- Data
- GLUE, PIT
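
The abstract describes a bi-encoder that folds predicate-argument information into the sentence representation through weighted aggregation. The sketch below is only an illustration of that general idea, not the authors' architecture: the functions `encode` and `weighted_aggregate`, the scoring vector `score_vec`, the mixing factor `alpha`, the softmax weighting, and the hand-written spans are all assumptions made for this example, and random vectors stand in for SBERT token embeddings.

```python
import math
import random


def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_pool(vectors):
    """Element-wise average of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_aggregate(span_vecs, scores):
    """Softmax-weighted sum of span vectors (hypothetical aggregation)."""
    weights = softmax(scores)
    dim = len(span_vecs[0])
    return [sum(w * vec[i] for w, vec in zip(weights, span_vecs))
            for i in range(dim)]


def encode(token_vecs, spans, score_vec, alpha=0.5):
    """Encode ONE sentence independently of its pair (bi-encoder style).

    token_vecs: per-token vectors (stand-ins for encoder outputs).
    spans: (start, end) token ranges for predicates and arguments;
           assumed to come from some external parser.
    score_vec, alpha: hypothetical scoring vector and mixing factor.
    """
    sentence_vec = mean_pool(token_vecs)
    if not spans:
        return sentence_vec
    span_vecs = [mean_pool(token_vecs[s:e]) for s, e in spans]
    scores = [dot(v, score_vec) for v in span_vecs]
    aggregated = weighted_aggregate(span_vecs, scores)
    return [x + alpha * y for x, y in zip(sentence_vec, aggregated)]


def cosine(u, v):
    """Cosine similarity, symmetric in its two arguments."""
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


# Toy data: random vectors in place of real token embeddings.
rng = random.Random(0)
dim = 8
score_vec = [rng.gauss(0, 1) for _ in range(dim)]
tokens_a = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(6)]
tokens_b = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(5)]
spans_a = [(0, 2), (2, 3), (3, 6)]  # e.g. argument, predicate, argument
spans_b = [(0, 1), (1, 2), (2, 5)]

emb_a = encode(tokens_a, spans_a, score_vec)
emb_b = encode(tokens_b, spans_b, score_vec)
sim_ab = cosine(emb_a, emb_b)
```

Because each sentence is encoded on its own and compared with a symmetric similarity, swapping the two inputs leaves the score unchanged, which is the property the abstract points to when it calls bi-encoders better suited to symmetric tasks. In practice a threshold or a small classifier over the two embeddings would turn `sim_ab` into a paraphrase decision.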