Towards web documents quality assessment for digital humanities scholars

We present a framework for assessing the quality of Web documents, together with a baseline covering three quality dimensions: trustworthiness, objectivity, and basic scholarly quality. Assessing Web document quality is a "deep data" problem: it requires approaches that handle both the size and the complexity of the data.
