Text Linkage in the Wiki Medium - A Comparative Study

We analyze four types of document networks with respect to their small-world characteristics. These characteristics make it possible to distinguish wiki-based systems from citation networks and from more traditional text-based networks augmented by hyperlinks. The study provides evidence that a more appropriate network model is needed, one that better reflects the specifics of wiki systems. It emphasizes the topological differences that wiki-related linking produces compared to other text-based networks.
