Acquiring a Formality-Informed Lexical Resource for Style Analysis

Elisabeth Eder, Ulrike Krieg-Holz, Udo Hahn


Abstract
To track different levels of formality in written discourse, we introduce a novel type of lexicon for the German language, with entries ordered by their degree of (in)formality. We start with a set of words extracted from traditional lexicographic resources, extend it by sentence-based similarity computations, and let crowdworkers assess the enlarged set of lexical items on a continuous informal-formal scale as a gold standard for evaluation. We submit this lexicon to an intrinsic evaluation related to the best regression models and their effect on predicting formality scores and complement our investigation by an extrinsic evaluation of formality on a German-language email corpus.
Anthology ID: 2021.eacl-main.174
Volume: Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume
Month: April
Year: 2021
Address: Online
Venue: EACL
Publisher: Association for Computational Linguistics
Pages: 2028–2041
URL: https://aclanthology.org/2021.eacl-main.174
DOI: 10.18653/v1/2021.eacl-main.174
Cite (ACL): Elisabeth Eder, Ulrike Krieg-Holz, and Udo Hahn. 2021. Acquiring a Formality-Informed Lexical Resource for Style Analysis. In Proceedings of the 16th Conference of the European Chapter of the Association for Computational Linguistics: Main Volume, pages 2028–2041, Online. Association for Computational Linguistics.
Cite (Informal): Acquiring a Formality-Informed Lexical Resource for Style Analysis (Eder et al., EACL 2021)
PDF: https://preview.aclanthology.org/update-css-js/2021.eacl-main.174.pdf
Code: ee-2/i-forger