Abstract
Most existing research on authorship attribution uses various lexical, syntactic, and semantic features. In this paper we demonstrate an effective template-based approach for combining various syntactic features of a document for authorship analysis. The parse-tree-based features that we propose are independent of the topic of a document and reflect the innate writing styles of authors. We show that the use of templates including sub-trees of parse trees in conjunction with other syntactic features results in improved author attribution rates. Another contribution is the demonstration that Dempster’s rule based combination of evidence from syntactic features performs better than other evidence-combination methods. We also demonstrate that our methodology works well for the case where the actual author is not included in the candidate author set.
- Anthology ID:
- C18-1234
- Volume:
- Proceedings of the 27th International Conference on Computational Linguistics
- Month:
- August
- Year:
- 2018
- Address:
- Santa Fe, New Mexico, USA
- Editors:
- Emily M. Bender, Leon Derczynski, Pierre Isabelle
- Venue:
- COLING
- Publisher:
- Association for Computational Linguistics
- Pages:
- 2766–2777
- URL:
- https://aclanthology.org/C18-1234
- Cite (ACL):
- Jagadeesh Patchala and Raj Bhatnagar. 2018. Authorship Attribution By Consensus Among Multiple Features. In Proceedings of the 27th International Conference on Computational Linguistics, pages 2766–2777, Santa Fe, New Mexico, USA. Association for Computational Linguistics.
- Cite (Informal):
- Authorship Attribution By Consensus Among Multiple Features (Patchala & Bhatnagar, COLING 2018)
- PDF:
- https://preview.aclanthology.org/naacl-24-ws-corrections/C18-1234.pdf
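
The abstract refers to combining evidence from several syntactic features with Dempster's rule. As a rough illustration of that rule only, and not of the authors' actual pipeline, the sketch below combines two mass functions over a small set of candidate authors. The author labels, the feature names (`m_pos`, `m_tree`), and all numeric masses are hypothetical values chosen for the example; mass assigned to the full candidate set stands for uncommitted belief.

```python
from itertools import product


def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule of combination.

    Each mass function maps a frozenset of hypotheses to a mass in [0, 1],
    and the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        intersection = a & b
        if intersection:
            # Agreeing evidence: the product supports the intersection.
            combined[intersection] = combined.get(intersection, 0.0) + mass_a * mass_b
        else:
            # Disjoint hypotheses: the product counts as conflict.
            conflict += mass_a * mass_b
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Renormalize by the non-conflicting mass.
    norm = 1.0 - conflict
    return {hyp: mass / norm for hyp, mass in combined.items()}


# Hypothetical candidate authors and per-feature evidence (illustrative only).
authors = frozenset({"A", "B", "C"})
m_pos = {frozenset({"A"}): 0.6, authors: 0.4}
m_tree = {frozenset({"A"}): 0.5, frozenset({"B"}): 0.3, authors: 0.2}

result = dempster_combine(m_pos, m_tree)
for hypothesis, mass in sorted(result.items(), key=lambda kv: -kv[1]):
    print(sorted(hypothesis), round(mass, 4))
```

In this toy case the two sources agree mostly on author A, so the combined mass on `{A}` ends up higher than either source assigned on its own, while the conflicting product (A from one source against B from the other) is discarded and the remainder renormalized.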