Testing the validity of Wikipedia categories for subject matter labelling of open-domain corpus data

The Wikipedia category system was designed to enable browsing and navigation of Wikipedia. It is also a useful resource for knowledge organisation and document indexing, especially using automatic ...
