Word Segmentation in Sanskrit Using Path Constrained Random Walks

Amrith Krishna, Bishal Santra, Pavankumar Satuluri, Sasi Prasanth Bandaru, Bhumi Faldu, Yajuvendra Singh, Pawan Goyal


Abstract
In Sanskrit, the phonemes at the word boundaries undergo changes to form new phonemes through a process called as sandhi. A fused sentence can be segmented into multiple possible segmentations. We propose a word segmentation approach that predicts the most semantically valid segmentation for a given sentence. We treat the problem as a query expansion problem and use the path-constrained random walks framework to predict the correct segments.
Anthology ID:
C16-1048
Volume:
Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers
Month:
December
Year:
2016
Address:
Osaka, Japan
Editors:
Yuji Matsumoto, Rashmi Prasad
Venue:
COLING
SIG:
Publisher:
The COLING 2016 Organizing Committee
Note:
Pages:
494–504
Language:
URL:
https://aclanthology.org/C16-1048
DOI:
Bibkey:
Cite (ACL):
Amrith Krishna, Bishal Santra, Pavankumar Satuluri, Sasi Prasanth Bandaru, Bhumi Faldu, Yajuvendra Singh, and Pawan Goyal. 2016. Word Segmentation in Sanskrit Using Path Constrained Random Walks. In Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics: Technical Papers, pages 494–504, Osaka, Japan. The COLING 2016 Organizing Committee.
Cite (Informal):
Word Segmentation in Sanskrit Using Path Constrained Random Walks (Krishna et al., COLING 2016)
Copy Citation:
PDF:
https://preview.aclanthology.org/nschneid-patch-3/C16-1048.pdf