Shagun Sinha


2020

Abstractive Text Summarization for Sanskrit Prose: A Study of Methods and Approaches
Shagun Sinha | Girish Jha
Proceedings of the WILDRE5 – 5th Workshop on Indian Language Data: Resources and Evaluation

The authors present a work in progress on Abstractive Text Summarization (ATS) for Sanskrit prose, a first attempt at ATS for Sanskrit (SATS). The paper evaluates recent ATS approaches and methods and argues for the ones best suited to Sanskrit prose, given the unique properties of the language. SATS has three goals: to produce manuscript summaries, to enrich the semantic processing of Sanskrit, and to improve information retrieval systems in the language. While Extractive Text Summarization (ETS) is an important method, the summaries it generates are not always coherent. For coherent, high-quality summaries, scholars consider ATS the better option. This paper reviews the ATS and ETS approaches applied to Sanskrit and other Indian languages to date. From this preliminary overview, the authors conclude that of the two available approaches, structure-based and semantic-based, the latter would be viable owing to the rich morphology of Sanskrit. A graph-based method may also be suitable. The second suggested method is supervised learning. The authors also suggest attempting cross-lingual summarization as a future extension of this work.