A Graph-Based Approach Towards Discerning Inherent Structures in a Digital Library of Formal Mathematics

As the amount of online formal mathematical content grows, through active efforts such as MathWeb [21], MOWGLI [4], and the Formal Digital Library (FDL) [1], among others, it becomes increasingly valuable to find automated means of managing this data and capturing semantics such as relatedness and significance. We apply graph-based approaches used for World Wide Web document search and analysis, such as HITS (Hyperlink-Induced Topic Search) [11], to collections of formal mathematical data. The nodes of the graphs we analyze are theorems and definitions, and the links are logical dependencies. By exploiting this link structure, we show how one may extract organizational and relatedness information from a collection of digital formal mathematics. We discuss the value of the extracted information and its potential applications in math search tools, theorem proving, and education.

[1] Andrei Voronkov, et al. Automated Deduction—CADE-18, 2002, Lecture Notes in Computer Science.

[2] Sergey Brin, et al. The Anatomy of a Large-Scale Hypertextual Web Search Engine, 1998, Comput. Networks.

[3] Vladimir Batagelj, et al. Pajek - Program for Large Network Analysis, 1999.

[4] Daniel W. Lozier, et al. The DLMF Project: A New Initiative in Classical Special Functions, 2000.

[5] Ravi Kumar, et al. Trawling the Web for Emerging Cyber-Communities, 1999, Comput. Networks.

[6] Olivier Pons, et al. Dependency Graphs for Interactive Theorem Provers, 2000.

[7] Michael Kohlhase, et al. OMDOC: Towards an Internet Standard for the Administration, Distribution, and Teaching of Mathematical Knowledge, 2000, AISC.

[8] Luca Padovani, et al. Mathematical Knowledge Management in HELM, 2003, Annals of Mathematics and Artificial Intelligence.

[9] Mark Bickford, et al. FDL: A Prototype Formal Digital Library, 2004.

[10] Andrei Z. Broder, et al. Graph structure in the Web, 2000, Comput. Networks.

[11] Mark Bickford, et al. A Logic of Events, 2003.

[12] Michael Kohlhase. OMDoc: an infrastructure for OpenMath content dictionary information, 2000, SIGS.

[13] David A. McAllester, et al. Automated Deduction - CADE-17, 2000, Lecture Notes in Computer Science.

[14] M. E. J. Newman, et al. Finding and evaluating community structure in networks, 2003, Physical Review E, Statistical, Nonlinear, and Soft Matter Physics.

[15] Christoph Kreitz, et al. The Nuprl Open Logical Environment, 2000, CADE.

[16] Albert-László Barabási, et al. Linked - how everything is connected to everything else and what it means for business, science, and everyday life, 2003.

[17] Michael Kohlhase, et al. System Description: The MathWeb Software Bus for Distributed Mathematical Reasoning, 2002, CADE.

[18] K. Ladizesky. Libraries and associations in the Transient World: new technologies and new forms of cooperation, 1997.

[19] Michael I. Jordan, et al. Stable algorithms for link analysis, 2001, SIGIR '01.

[20] M. Newman, et al. Why social networks are different from other types of networks, 2003, Physical Review E, Statistical, Nonlinear, and Soft Matter Physics.