Authorship Attribution By Consensus Among Multiple Features

Jagadeesh Patchala, Raj Bhatnagar


Abstract
Most existing research on authorship attribution uses various lexical, syntactic and semantic features. In this paper we demonstrate an effective template-based approach for combining various syntactic features of a document for authorship analysis. The parse-tree-based features that we propose are independent of the topic of a document and reflect the innate writing styles of authors. We show that the use of templates that include sub-trees of parse trees, in conjunction with other syntactic features, results in improved author attribution rates. Another contribution is the demonstration that combining evidence from syntactic features using Dempster’s rule performs better than other evidence-combination methods. We also demonstrate that our methodology works well in the case where the actual author is not included in the candidate author set.
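As an illustration of the evidence-combination step mentioned in the abstract, the following is a minimal sketch of Dempster's rule of combination for two sources of evidence. It is not the authors' implementation: the abstract does not say how mass values are derived from the features, so the function name `dempster_combine`, the feature names, the candidate authors and all mass values below are hypothetical.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of candidate authors to a mass,
    and the masses of each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (b, mass_b), (c, mass_c) in product(m1.items(), m2.items()):
        intersection = b & c
        if intersection:
            # Agreeing evidence supports the intersection of the two sets.
            combined[intersection] = combined.get(intersection, 0.0) + mass_b * mass_c
        else:
            # Disjoint sets contribute to the conflict mass K.
            conflict += mass_b * mass_c
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalise by 1 - K so the combined masses sum to 1.
    return {h: mass / (1.0 - conflict) for h, mass in combined.items()}


# Hypothetical example: three candidate authors and two syntactic feature types.
authors = frozenset({"A", "B", "C"})
m_parse = {frozenset({"A"}): 0.6, frozenset({"B"}): 0.1, authors: 0.3}
m_pos = {frozenset({"A"}): 0.5, frozenset({"C"}): 0.2, authors: 0.3}

fused = dempster_combine(m_parse, m_pos)
best = max(fused, key=fused.get)
print(sorted(best), round(fused[best], 3))
```

Mass assigned to the full candidate set expresses a feature's uncertainty rather than support for any one author. One conceivable way to handle an author outside the candidate set is to treat a large residual mass on the full set as a sign that no candidate is well supported, though the abstract does not state that the paper does this.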
Anthology ID:
C18-1234
Volume:
Proceedings of the 27th International Conference on Computational Linguistics
Month:
August
Year:
2018
Address:
Santa Fe, New Mexico, USA
Editors:
Emily M. Bender, Leon Derczynski, Pierre Isabelle
Venue:
COLING
Publisher:
Association for Computational Linguistics
Pages:
2766–2777
URL:
https://aclanthology.org/C18-1234
Cite (ACL):
Jagadeesh Patchala and Raj Bhatnagar. 2018. Authorship Attribution By Consensus Among Multiple Features. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2766–2777, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
Cite (Informal):
Authorship Attribution By Consensus Among Multiple Features (Patchala & Bhatnagar, COLING 2018)
PDF:
https://preview.aclanthology.org/naacl-24-ws-corrections/C18-1234.pdf