Abstract
In the analysis of contract texts, validated model texts, such as model clauses, can be used to identify reused contract clauses. This paper investigates how to calculate the similarity between titles of model clauses and headings extracted from contracts, and which similarity measure is most suitable for this task. To calculate the similarities between title pairs, we tested several variants of string similarity and token-based similarity. We also compare two more semantic similarity measures based on word embeddings, using both pretrained embeddings and word embeddings trained on contract texts. Identifying the model clause title can serve as a starting point for mapping clauses found in contracts to verified clauses.
- Anthology ID:
- W19-0803
- Volume:
- RELATIONS - Workshop on meaning relations between phrases and sentences
- Month:
- May
- Year:
- 2019
- Address:
- Gothenburg, Sweden
- Editors:
- Venelin Kovatchev, Darina Gold, Torsten Zesch
- Venue:
- IWCS
- SIG:
- SIGSEM
- Publisher:
- Association for Computational Linguistics
- URL:
- https://aclanthology.org/W19-0803
- DOI:
- 10.18653/v1/W19-0803
- Cite (ACL):
- Frieda Josi, Christian Wartena, and Ulrich Heid. 2019. Detecting Paraphrases of Standard Clause Titles in Insurance Contracts. In RELATIONS - Workshop on meaning relations between phrases and sentences, Gothenburg, Sweden. Association for Computational Linguistics.
- Cite (Informal):
- Detecting Paraphrases of Standard Clause Titles in Insurance Contracts (Josi et al., IWCS 2019)
- PDF:
- https://preview.aclanthology.org/nschneid-patch-4/W19-0803.pdf
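
The abstract mentions string similarity and token-based similarity between model clause titles and contract headings without naming the exact variants. As a rough illustration only, the sketch below assumes two common stand-ins: a character-level ratio from Python's `difflib` and a Jaccard overlap on whitespace tokens. The function names, the example titles, and the choice of measures are assumptions made for this example and are not taken from the paper.

```python
from difflib import SequenceMatcher


def string_similarity(a, b):
    """Character-level similarity in [0, 1] (difflib ratio; an assumed stand-in)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def token_jaccard(a, b):
    """Jaccard overlap of lowercased whitespace tokens (an assumed stand-in)."""
    tokens_a, tokens_b = set(a.lower().split()), set(b.lower().split())
    if not tokens_a and not tokens_b:
        return 1.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def best_match(heading, model_titles, measure):
    """Return (title, score) for the model clause title most similar to the heading."""
    scored = [(title, measure(heading, title)) for title in model_titles]
    return max(scored, key=lambda pair: pair[1])


# Hypothetical titles, invented for illustration only.
model_titles = ["Termination of contract", "Premium payment", "Scope of cover"]
heading = "Payment of the premium"

print(best_match(heading, model_titles, string_similarity))
print(best_match(heading, model_titles, token_jaccard))
```

The token-based measure ignores word order, so a reordered heading such as "Payment of the premium" can still match "Premium payment", whereas a character-level measure is more sensitive to reordering. Embedding-based measures, as compared in the paper, would additionally aim to capture paraphrases that share no tokens at all.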