Browsing mixed structured and unstructured data

Structured and unstructured data, including structured data representing several different types of tuples, may be integrated into a single list for browsing or retrieval. Data may be arranged in the Gray code order of their features and metadata, producing an optimal ordering for browsing. We provide several metrics for evaluating the performance of systems that support browsing, given some constraints. Metadata and indexing terms serve as sorting keys and attributes for structured data, as well as for semi-structured or unstructured documents, images, media, etc. Economic and information-theoretic models are suggested that enable the ordering to adapt to user preferences. Different relational structures and unstructured data may be integrated into a single, optimal ordering for browsing or for displaying tables in digital libraries, database management systems, or information retrieval systems. Adaptive displays of data are also discussed.
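As a rough illustration of arranging items in Gray code order, the sketch below sorts a small mixed list by the position of each item's binary feature vector in the standard binary reflected Gray code. This is a minimal sketch under stated assumptions: the abstract does not say which Gray code variant or feature encoding is used, and the names `gray_rank`, `gray_order`, and the sample records are hypothetical.

```python
from typing import Iterable, List, Sequence, Tuple

Record = Tuple[str, Sequence[int]]  # (label, binary feature vector); hypothetical layout


def gray_rank(bits: Iterable[int]) -> int:
    """Return the position of a bit vector in the binary reflected Gray code.

    Assumption: the vector is read as a Gray codeword, most significant
    bit first. Decoding uses the running XOR of the Gray bits.
    """
    rank = 0
    running = 0
    for bit in bits:
        running ^= bit              # decoded binary bit at this position
        rank = (rank << 1) | running
    return rank


def gray_order(records: Iterable[Record]) -> List[Record]:
    """Sort records so that their feature vectors follow Gray code order."""
    return sorted(records, key=lambda record: gray_rank(record[1]))


# Hypothetical mixed collection: each feature bit might come from a metadata
# field (structured data) or from an indexing term (unstructured data).
records = [
    ("table row", (1, 0, 0)),
    ("image", (0, 1, 1)),
    ("text document", (0, 0, 1)),
    ("video clip", (1, 1, 0)),
    ("catalog entry", (0, 1, 0)),
]

for label, features in gray_order(records):
    print(features, label)
```

When every possible feature vector is present, neighbors in this ordering differ in exactly one feature, which is the property that makes a Gray code ordering attractive for browsing. With a sparse collection such as the one above, neighbors can differ in more than one feature.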
