Measuring the Limit of Semantic Divergence for English Tweets.

Dwijen Rudrapal, Amitava Das


Abstract
In human language, the same expression can be conveyed in many ways by different people. Even the same person may express the same sentence quite differently when addressing different audiences, using different modalities, different syntactic variations, or a different vocabulary. The possibility of such endless surface forms of text, while the meaning remains almost the same, poses many challenges for Natural Language Processing (NLP) systems such as question answering, machine translation, and text summarization. This paper is an endeavor to understand the characteristics of such endless semantic divergence. In this work we develop a corpus of 1525 semantically divergent sentences for 200 English tweets.
Anthology ID:
R17-1080
Volume:
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
618–624
URL:
https://doi.org/10.26615/978-954-452-049-6_080
DOI:
10.26615/978-954-452-049-6_080
Cite (ACL):
Dwijen Rudrapal and Amitava Das. 2017. Measuring the Limit of Semantic Divergence for English Tweets. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 618–624, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Measuring the Limit of Semantic Divergence for English Tweets. (Rudrapal & Das, RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-049-6_080