From Scanned Image to Knowledge Sharing Formats and Technologies in the Digital Mathematics Library Project

The main obstacle to easy access to the vast amount of knowledge is that it is not available in a well-designed, standard, fully indexed electronic form, together with detailed metadata and full-text search capabilities. This paper is a case study of design issues in a subproject of the WDML (World Digital Mathematics Library) aimed at digitizing valuable mathematical journals and books published in the Czech and Slovak Republics and making them publicly available in digital form. We discuss the design of the workflow aimed at storing mathematical knowledge in a digital library. The key concept is the gradual enhancement of the digital material by 'knowledge-enhancing' filters applied to the markup-rich XML data.
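The idea of gradual enhancement by filters can be illustrated with a minimal sketch. It is not the project's actual implementation: the element names `page` and `line`, the attribute `words`, and the two filters are hypothetical and chosen only for illustration. Each filter takes an XML element, adds or cleans up some information, and passes the result to the next filter.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Iterable

# A filter takes an XML element and returns it enriched or cleaned up.
Filter = Callable[[ET.Element], ET.Element]


def normalize_whitespace(page: ET.Element) -> ET.Element:
    """Collapse runs of whitespace in every (hypothetical) <line> element."""
    for line in page.iter("line"):
        line.text = " ".join((line.text or "").split())
    return page


def count_words(page: ET.Element) -> ET.Element:
    """Record the word count as a (hypothetical) 'words' attribute."""
    total = sum(len((line.text or "").split()) for line in page.iter("line"))
    page.set("words", str(total))
    return page


def run_pipeline(page: ET.Element, filters: Iterable[Filter]) -> ET.Element:
    """Apply the filters in order; each one sees the output of the previous."""
    for apply_filter in filters:
        page = apply_filter(page)
    return page


# Illustrative input standing in for OCR output of one scanned page.
raw = "<page n='1'><line>  Let  f  be a   function </line><line>on  R.</line></page>"
page = run_pipeline(ET.fromstring(raw), [normalize_whitespace, count_words])
print(ET.tostring(page, encoding="unicode"))
```

Because every filter reads and writes the same XML representation, further steps (for example language identification or metadata extraction) could be appended to the list without changing the existing ones.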
