Adding machine learning and knowledge intensive techniques to a digital library service

This paper presents IDL, a prototypical digital library service. It integrates machine learning tools and intelligent techniques to make the process of capturing the information to be stored and indexed by content in the digital library effective, efficient and economically feasible. Information capture and semantic indexing are critical issues when building a digital library, since they involve complex pattern recognition problems such as document analysis, classification and understanding. Experimental results show that learning systems can solve all these problems effectively and efficiently.
