Structure Formation in the Web

In this chapter we develop a representation model of web document networks. Based on the notion of uncertain web document structures, the model is defined as a template that captures the nested manifestation levels of hypertext types. We specify the model at the conceptual, formal, and physical levels and exemplify it by reconstructing competing web document models.
