Graph-Based Approach to Recognizing CST Relations in Polish Texts

Paweł Kędzia, Maciej Piasecki, Arkadiusz Janz


Abstract
This paper presents a supervised approach to the recognition of Cross-document Structure Theory (CST) relations in Polish texts. In the proposed approach, a graph-based representation is constructed for each sentence. Graphs are built on the basis of lexicalised syntactic-semantic relations extracted from text. Similarity between sentences is calculated from their graphs, and the similarity values are input to classifiers trained with the Logistic Model Tree algorithm. Several different graph configurations, as well as graph similarity methods, were analysed for this task. The approach was evaluated on a large open corpus annotated manually with 17 types of selected CST relations. The configuration of experiments was similar to those known from SEMEVAL, and we obtained very promising results.
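The pipeline described in the abstract (sentence graphs, graph similarity, similarity values as classifier input) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the abstract does not say which graph configurations or similarity measures the authors used, so the sketch stands in plain Jaccard overlap on nodes and on labelled edges, and the triples, relation labels, and function names are hypothetical. The classifier itself is not shown.

```python
from typing import Iterable, List, Set, Tuple

# A sentence graph as a collection of labelled edges:
# (head lemma, relation label, dependent lemma).
# Assumption: the paper's actual graph configurations may differ.
Edge = Tuple[str, str, str]


def jaccard(a: Iterable, b: Iterable) -> float:
    """Jaccard overlap of two collections; 0.0 when both are empty."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)


def nodes(edges: Iterable[Edge]) -> Set[str]:
    """Collect all lemmas that appear as head or dependent of an edge."""
    result: Set[str] = set()
    for head, _label, dependent in edges:
        result.add(head)
        result.add(dependent)
    return result


def similarity_features(g1: List[Edge], g2: List[Edge]) -> List[float]:
    """Feature vector for a sentence pair: node overlap, then edge overlap.

    Jaccard is a stand-in measure chosen for this sketch, not the
    similarity method reported in the paper.
    """
    return [jaccard(nodes(g1), nodes(g2)), jaccard(g1, g2)]


# Toy, hand-written triples (hypothetical; not taken from the corpus).
s1 = [("buy", "subj", "company"), ("buy", "obj", "factory")]
s2 = [("buy", "subj", "company"), ("buy", "obj", "plant")]

features = similarity_features(s1, s2)
print(features)
```

A vector like `features`, computed for every sentence pair, would then serve as the input to a supervised classifier that predicts the CST relation type.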
Anthology ID:
R17-1048
Volume:
Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017
Month:
September
Year:
2017
Address:
Varna, Bulgaria
Venue:
RANLP
Publisher:
INCOMA Ltd.
Pages:
363–371
URL:
https://doi.org/10.26615/978-954-452-049-6_048
DOI:
10.26615/978-954-452-049-6_048
Cite (ACL):
Paweł Kędzia, Maciej Piasecki, and Arkadiusz Janz. 2017. Graph-Based Approach to Recognizing CST Relations in Polish Texts. In Proceedings of the International Conference Recent Advances in Natural Language Processing, RANLP 2017, pages 363–371, Varna, Bulgaria. INCOMA Ltd.
Cite (Informal):
Graph-Based Approach to Recognizing CST Relations in Polish Texts (Kędzia et al., RANLP 2017)
PDF:
https://doi.org/10.26615/978-954-452-049-6_048