Distractorless Authorship Verification

John Noecker Jr, Michael Ryan


Abstract
Authorship verification is the task of determining, given a document and a candidate author, whether or not the document was written by that author. Traditional methods have taken a “candidate author vs. everything else” approach. Thus, perhaps the most important aspect of performing authorship verification on a document is the development of an appropriate distractor set to represent “everything not the candidate author”. The validity of the results of such experiments hinges on the ability to develop an appropriately representative set of distractor documents. Here, we propose a method for performing authorship verification without the use of a distractor set. Using only training data from the candidate author, we are able to perform authorship verification with high confidence (greater than 90% accuracy rates across a large corpus).
Anthology ID:
L12-1090
Volume:
Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12)
Month:
May
Year:
2012
Address:
Istanbul, Turkey
Venue:
LREC
Publisher:
European Language Resources Association (ELRA)
Pages:
785–789
URL:
http://www.lrec-conf.org/proceedings/lrec2012/pdf/238_Paper.pdf
Cite (ACL):
John Noecker Jr and Michael Ryan. 2012. Distractorless Authorship Verification. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pages 785–789, Istanbul, Turkey. European Language Resources Association (ELRA).
Cite (Informal):
Distractorless Authorship Verification (Noecker Jr & Ryan, LREC 2012)