Modelling Uncertainty in Collaborative Document Quality Assessment

Previous work on document quality assessment has mainly focused on predicting the quality of a document relative to a putative gold standard, without attending to the subjectivity of the task. Because people disagree over inherently subjective tasks such as rating the quality of a Wikipedia article, a document quality assessment system should provide not only a prediction of article quality but also the uncertainty of that prediction. We therefore measure the uncertainty of document quality predictions in addition to predicting the label. Experimental results show that both Gaussian processes (GPs) and random forests (RFs) yield competitive results in predicting the quality of Wikipedia articles, while also providing an estimate of uncertainty when the quality labels assigned by Wikipedia contributors are inconsistent. We additionally evaluate our methods in a semi-automated decision-making process for assigning document quality classes, where overestimates and underestimates of document quality carry asymmetric risk. Our experiments suggest that GPs provide more reliable estimates in this context.
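
To make the asymmetric-risk setting concrete, the following is a minimal sketch of how a predictive mean and standard deviation (as a GP, or an RF via the spread of its trees, could supply) might be turned into a risk-adjusted point prediction. It rests on assumptions not stated in the abstract: quality is treated as a numeric score, the predictive distribution is taken to be Gaussian, and the loss is an asymmetric linear one with hypothetical costs `cost_over` and `cost_under`. The function name and the example numbers are illustrative only and are not the authors' implementation.

```python
from statistics import NormalDist


def risk_adjusted_prediction(mean, std, cost_over, cost_under):
    """Point prediction minimising expected asymmetric linear loss.

    Assumed loss (hypothetical, for illustration):
        cost_over  * (pred - y)  if pred > y   (overestimate)
        cost_under * (y - pred)  otherwise     (underestimate)
    Under a Gaussian predictive distribution N(mean, std^2), the minimiser
    is the quantile at level cost_under / (cost_over + cost_under).
    """
    if std <= 0 or cost_over <= 0 or cost_under <= 0:
        raise ValueError("std and both costs must be positive")
    quantile_level = cost_under / (cost_over + cost_under)
    return NormalDist(mean, std).inv_cdf(quantile_level)


# Two articles with the same predicted score but different uncertainty.
# Overestimating quality is assumed to cost three times as much as
# underestimating it, so both predictions are shifted downwards.
confident = risk_adjusted_prediction(mean=4.0, std=0.2, cost_over=3.0, cost_under=1.0)
uncertain = risk_adjusted_prediction(mean=4.0, std=1.0, cost_over=3.0, cost_under=1.0)
print(f"confident article: {confident:.3f}")
print(f"uncertain article: {uncertain:.3f}")
```

With equal costs the adjusted prediction coincides with the mean. As the costs diverge, the prediction moves away from the costlier error in proportion to the predictive standard deviation, which is why the reliability of the uncertainty estimate matters in this setting.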
