Fine-grained Coordinated Cross-lingual Text Stream Alignment for Endless Language Knowledge Acquisition
Tao Ge, Qing Dou, Heng Ji, Lei Cui, Baobao Chang, Zhifang Sui, Furu Wei, Ming Zhou
Abstract
This paper proposes to study fine-grained coordinated cross-lingual text stream alignment through a novel information network decipherment paradigm. We use Burst Information Networks as media to represent text streams and present a simple yet effective network decipherment algorithm with diverse clues to decipher the networks for accurate text stream alignment. Experiments on Chinese-English news streams show that our approach not only outperforms previous approaches on bilingual lexicon extraction from coordinated text streams but can also harvest high-quality alignments from large amounts of streaming data for endless language knowledge mining, making it a promising new paradigm for automatic language knowledge acquisition.
- Anthology ID:
- D18-1271
- Volume:
- Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing
- Month:
- October-November
- Year:
- 2018
- Address:
- Brussels, Belgium
- Editors:
- Ellen Riloff, David Chiang, Julia Hockenmaier, Jun’ichi Tsujii
- Venue:
- EMNLP
- SIG:
- SIGDAT
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2496–2506
- URL:
- https://aclanthology.org/D18-1271
- DOI:
- 10.18653/v1/D18-1271
- Cite (ACL):
- Tao Ge, Qing Dou, Heng Ji, Lei Cui, Baobao Chang, Zhifang Sui, Furu Wei, and Ming Zhou. 2018. Fine-grained Coordinated Cross-lingual Text Stream Alignment for Endless Language Knowledge Acquisition. In Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing, pages 2496–2506, Brussels, Belgium. Association for Computational Linguistics.
- Cite (Informal):
- Fine-grained Coordinated Cross-lingual Text Stream Alignment for Endless Language Knowledge Acquisition (Ge et al., EMNLP 2018)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-5/D18-1271.pdf