Abstract
We present a simple, easy-to-replicate monolingual aligner that demonstrates state-of-the-art performance while relying on almost no supervision and a very small number of external resources. Based on the hypothesis that words with similar meanings represent potential pairs for alignment if located in similar contexts, we propose a system that operates by finding such pairs. In two intrinsic evaluations on alignment test data, our system achieves F1 scores of 88–92%, demonstrating 1–3% absolute improvement over the previous best system. Moreover, in two extrinsic evaluations our aligner outperforms existing aligners, and even a naive application of the aligner approaches state-of-the-art performance in each extrinsic task.

- Anthology ID: Q14-1018
- Volume: Transactions of the Association for Computational Linguistics, Volume 2
- Year: 2014
- Address: Cambridge, MA
- Venue: TACL
- Publisher: MIT Press
- Pages: 219–230
- URL: https://aclanthology.org/Q14-1018
- DOI: 10.1162/tacl_a_00178
- Cite (ACL): Md Arafat Sultan, Steven Bethard, and Tamara Sumner. 2014. Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence. Transactions of the Association for Computational Linguistics, 2:219–230.
- Cite (Informal): Back to Basics for Monolingual Alignment: Exploiting Word Similarity and Contextual Evidence (Sultan et al., TACL 2014)
- PDF: https://preview.aclanthology.org/paclic-22-ingestion/Q14-1018.pdf
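The abstract's central hypothesis — that two words are a likely alignment pair when they are both lexically similar and surrounded by similar contexts — can be illustrated with a minimal sketch. This is not the authors' actual system (the paper draws on richer word-similarity resources and contextual evidence); the similarity functions, weights, and threshold below are simple illustrative stand-ins.

```python
# Minimal sketch of combining word similarity with contextual evidence
# for monolingual alignment. All functions and parameters here are
# illustrative placeholders, not the paper's implementation.

def lexical_sim(w1, w2):
    # Stand-in for a word-similarity resource: case-insensitive exact match.
    return 1.0 if w1.lower() == w2.lower() else 0.0

def context_sim(i, j, sent1, sent2, window=2):
    # Jaccard overlap between small windows of neighboring words.
    ctx1 = {w.lower() for w in sent1[max(0, i - window):i] + sent1[i + 1:i + 1 + window]}
    ctx2 = {w.lower() for w in sent2[max(0, j - window):j] + sent2[j + 1:j + 1 + window]}
    if not ctx1 or not ctx2:
        return 0.0
    return len(ctx1 & ctx2) / len(ctx1 | ctx2)

def align(sent1, sent2, w_lex=0.9, w_ctx=0.1, threshold=0.5):
    # Greedily pair each word in sent1 with its best unused match in sent2,
    # keeping only pairs whose combined score clears the threshold.
    pairs = []
    used = set()
    for i, w1 in enumerate(sent1):
        best, best_j = 0.0, None
        for j, w2 in enumerate(sent2):
            if j in used:
                continue
            score = (w_lex * lexical_sim(w1, w2)
                     + w_ctx * context_sim(i, j, sent1, sent2))
            if score > best:
                best, best_j = score, j
        if best_j is not None and best >= threshold:
            pairs.append((i, best_j))
            used.add(best_j)
    return pairs
```

For example, aligning `"the cat sat on the mat"` against `"a cat sat on a mat"` pairs the content words (cat, sat, on, mat) while leaving the mismatched function words unaligned, since lexical similarity alone is not enough to clear the threshold.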