Modelling Uncertainty in Collaborative Document Quality Assessment

Aili Shen, Daniel Beck, Bahar Salehi, Jianzhong Qi, Timothy Baldwin


Abstract
In the context of document quality assessment, previous work has mainly focused on predicting the quality of a document relative to a putative gold standard, without paying attention to the subjectivity of this task. To imitate people’s disagreement over inherently subjective tasks such as rating the quality of a Wikipedia article, a document quality assessment system should provide not only a prediction of the article quality but also the uncertainty over its predictions. This motivates us to measure the uncertainty in document quality predictions, in addition to making the label prediction. Experimental results show that both Gaussian processes (GPs) and random forests (RFs) can yield competitive results in predicting the quality of Wikipedia articles, while providing an estimate of uncertainty when there is inconsistency in the quality labels from the Wikipedia contributors. We additionally evaluate our methods in the context of a semi-automated document quality class assignment decision-making process, where there is asymmetric risk associated with overestimates and underestimates of document quality. Our experiments suggest that GPs provide more reliable estimates in this context.
Anthology ID:
D19-5525
Volume:
Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019)
Month:
November
Year:
2019
Address:
Hong Kong, China
Editors:
Wei Xu, Alan Ritter, Tim Baldwin, Afshin Rahimi
Venue:
WNUT
Publisher:
Association for Computational Linguistics
Pages:
191–201
URL:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/D19-5525/
DOI:
10.18653/v1/D19-5525
Cite (ACL):
Aili Shen, Daniel Beck, Bahar Salehi, Jianzhong Qi, and Timothy Baldwin. 2019. Modelling Uncertainty in Collaborative Document Quality Assessment. In Proceedings of the 5th Workshop on Noisy User-generated Text (W-NUT 2019), pages 191–201, Hong Kong, China. Association for Computational Linguistics.
Cite (Informal):
Modelling Uncertainty in Collaborative Document Quality Assessment (Shen et al., WNUT 2019)
PDF:
https://preview.aclanthology.org/sigedu-bea-out-of-sync-correction/D19-5525.pdf