Mining Graph Patterns in Web-based Systems: A Conceptual View

The task of applying Data Mining methods [38] to web-based hypertexts is often referred to as Web Mining [16]. Given the steadily growing complexity of web data sources and the sheer volume of information available online, Web Mining has become an important and fruitful research topic [16, 46]. Generally, Web Mining can be divided into the following categories:

[1] Horst Bunke, et al. A New Algorithm for Error-Tolerant Subgraph Isomorphism Detection, 1998, IEEE Trans. Pattern Anal. Mach. Intell.

[2] M. Dehmer, et al. Analysis of Complex Networks: From Biology to Linguistics, 2009.

[3] Eli Upfal, et al. The Web as a graph, 2000, PODS.

[4] Tanja Gesell, et al. A comparative analysis of multidimensional features of objects resembling sets of graphs, 2008, Appl. Math. Comput.

[5] Erhard Rahm, et al. Kurz erklärt - Web Usage Mining, 2002, Datenbank-Spektrum.

[6] Soumen Chakrabarti, et al. Mining the web - discovering knowledge from hypertext data, 2002.

[7] Matthias Dehmer, et al. Strukturelle Analyse web-basierter Dokumente, 2005.

[8] Matthias Dimter. Textklassenkonzepte heutiger Alltagssprache, 1981.

[9] Matthias Dehmer, et al. Information theoretic measures of UHG graphs with low computational complexity, 2007, Appl. Math. Comput.

[10] Tao Jiang, et al. Alignment of Trees - An Alternative to Tree Edit, 1994, CPM.

[11] Alexander Mehler, et al. Genres on the Web: Computational Models and Empirical Studies, 2010.


[13] Matthias Dehmer, et al. Information processing in complex networks: Graph entropy and information functionals, 2008, Appl. Math. Comput.

[14] Frank Harary, et al. Graph Theory, 2016.

[15] A. Barabasi, et al. Network biology: understanding the cell's functional organization, 2004, Nature Reviews Genetics.

[16] Hendrik Blockeel, et al. Web mining research: a survey, 2000, SKDD.

[17] Rick Kazman, et al. WebQuery: Searching and Visualizing the Web Through Connectivity, 1997, Comput. Networks.

[18] Albert-László Barabási, et al. Internet: Diameter of the World-Wide Web, 1999, Nature.

[19] O. Mason, et al. Graph theory and networks in Biology, 2006, IET systems biology.

[20] Lada A. Adamic, et al. Internet: Growth dynamics of the World-Wide Web, 1999, Nature.

[21] Donia Scott, et al. Document Structure, 2003, CL.

[22] Sergey N. Dorogovtsev, et al. Evolution of Networks: From Biological Nets to the Internet and WWW, 2003.

[23] David Buttler, et al. A Short Survey of Document Structure Similarity Algorithms, 2004, International Conference on Internet Computing.

[24] Jiawei Han, et al. Data Mining: Concepts and Techniques, 2000.

[25] Kuo-Chung Tai, et al. The Tree-to-Tree Correction Problem, 1979, JACM.

[26] Elio Masciari, et al. Detecting Structural Similarities between XML Documents, 2002, WebDB.

[27] Alexander Mehler, et al. Towards Structure-sensitive Hypertext Categorization, 2005, GfKl.

[28] Matthias Dehmer, et al. Structural similarity of directed universal hierarchical graphs: A low computational complexity approach, 2007, Appl. Math. Comput.

[29] Alexander Mehler, et al. Generalized Shortest Paths Trees: A Novel Graph Class Applied to Semiotic Networks, 2009.

[30] Alexander Mehler, et al. Measuring the Structural Similarity of Web-based Documents: A Novel Approach, 2007.

[31] Matthias Dehmer, et al. A similarity measure for graphs with low computational complexity, 2006, Appl. Math. Comput.

[32] Georg Rehm. Hypertextsorten: Definition - Struktur - Klassifikation, 2006.

[33] Danail Bonchev, et al. Information theoretic indices for characterization of chemical structures, 1983.

[34] T. Richter, et al. LOGPAT: A semi-automatic way to analyze hypertext navigation behavior, 2003.

[35] Vladimir Batagelj, et al. Similarity measures between structured objects, 1989.

[36] Andrew Tomkins, et al. Dense Subgraph Extraction, 2006.

[37] Myra Spiliopoulou, et al. Web usage mining for Web site evaluation, 2000, CACM.

[38] Horst Bunke, et al. Recent developments in graph matching, 2000, Proceedings 15th International Conference on Pattern Recognition. ICPR-2000.

[39] Lars Littig, et al. Classification of Web Sites at Super-genre Level, 2011, Genres on the Web.

[40] Kurt Varmuza, et al. Clustering and similarity of chemical structures represented by binary substructure descriptors, 2003.

[41] Alexander Mehler, et al. Towards Logical Hypertext Structure: A Graph-Theoretic Perspective, 2006.

[42] L. Foulds. Graph Theory Applications, 1991.

[43] M. V. Valkenburg. Network Analysis, 1964.

[44] Sachindra Joshi, et al. A bag of paths model for measuring structural similarity in Web documents, 2003, KDD '03.

[45] Vladimir I. Levenshtein, et al. Binary codes capable of correcting deletions, insertions, and reversals, 1965.

[46] Prabhakar Raghavan, et al. Graph Structure of the Web: A Survey, 2000, LATIN.

[47] Subhash C. Basak, et al. Determining structural similarity of chemicals using graph-theoretic indices, 1988, Discret. Appl. Math.

[48] Bohdan Zelinka, et al. On a certain distance between isomorphism classes of graphs, 1975.

[49] Lawrence B. Holder, et al. Mining Graph Data, 2006.

[50] Stefan Bornholdt, et al. Handbook of Graphs and Networks: From the Genome to the Internet, 2003.

[51] Alexander Mehler, et al. A Two-level Approach to Web Genre Classification, 2009, WEBIST.

[52] Alexander Mehler, et al. Generalized Shortest Path Trees: A Novel Graph Class by Example of Semiotic Networks, 2009.

[53] Jon Kleinberg, et al. Authoritative sources in a hyperlinked environment, 1999, SODA '98.

[54] Suzanne M. Ward. Books on Demand, 2002.

[55] Soumen Chakrabarti, et al. Integrating the document object model with hyperlinks for enhanced topic distillation and information extraction, 2001, WWW '01.

[56] Alexander Mehler, et al. Aspekte der Kategorisierung von Webseiten, 2004, GI Jahrestagung.

[57] Alexander Mehler. Structure Formation in the Web, 2010.

[58] S. N. Dorogovtsev, et al. Evolution of networks, 2001, cond-mat/0106144.

[59] Jiang He. Web Usage Mining, 2002.

[60] Stanley M. Selkow, et al. The Tree-to-Tree Editing Problem, 1977, Inf. Process. Lett.