Poetry to Prose Conversion in Sanskrit as a Linearisation Task: A Case for Low-Resource Languages

Amrith Krishna, Vishnu Sharma, Bishal Santra, Aishik Chakraborty, Pavankumar Satuluri, Pawan Goyal


Abstract
The word ordering in a Sanskrit verse is often not aligned with its corresponding prose order. Conversion of the verse to its corresponding prose helps in better comprehension of the construction. Owing to the resource constraints, we formulate this task as a word ordering (linearisation) task. In doing so, we completely ignore the word arrangement at the verse side. kāvya guru, the approach we propose, essentially consists of a pipeline of two pretraining steps followed by a seq2seq model. The first pretraining step learns task-specific token embeddings from pretrained embeddings. In the next step, we generate multiple possible hypotheses for possible word arrangements of the input %using another pretraining step. We then use them as inputs to a neural seq2seq model for the final prediction. We empirically show that the hypotheses generated by our pretraining step result in predictions that consistently outperform predictions based on the original order in the verse. Overall, kāvya guru outperforms current state of the art models in linearisation for the poetry to prose conversion task in Sanskrit.
Anthology ID:
P19-1111
Volume:
Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics
Month:
July
Year:
2019
Address:
Florence, Italy
Editors:
Anna Korhonen, David Traum, Lluís Màrquez
Venue:
ACL
SIG:
Publisher:
Association for Computational Linguistics
Note:
Pages:
1160–1166
Language:
URL:
https://aclanthology.org/P19-1111
DOI:
10.18653/v1/P19-1111
Bibkey:
Cite (ACL):
Amrith Krishna, Vishnu Sharma, Bishal Santra, Aishik Chakraborty, Pavankumar Satuluri, and Pawan Goyal. 2019. Poetry to Prose Conversion in Sanskrit as a Linearisation Task: A Case for Low-Resource Languages. In Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics, pages 1160–1166, Florence, Italy. Association for Computational Linguistics.
Cite (Informal):
Poetry to Prose Conversion in Sanskrit as a Linearisation Task: A Case for Low-Resource Languages (Krishna et al., ACL 2019)
Copy Citation:
PDF:
https://preview.aclanthology.org/ingest-bitext-workshop/P19-1111.pdf