Personalized Information Environments: An Architecture for Customizable Access to Distributed Digital Libraries

We describe the conceptual architecture of a Personalized Information Environment, or PIE. A PIE allows unified, highly customizable access to distributed information resources by providing users the tools to compose personalized collections from a palette of information resources. The architecture also provides for the efficient exchange of inter-resource meta-information, like collection statistics, in order to maximize retrieval effectiveness. This paper includes the enunciation of the user-centered PIE vision, an architectural requirements specification, and an architectural description that meets the specification and supports the vision. We also describe our current implementation and research efforts conducted within the PIE framework.

In 1945, Vannevar Bush first described an information environment that he felt was bordering on the unmanageable, with information users awash in research results and scholarly communication and few usable mechanisms to organize them. Bush felt that current mechanisms for dealing with information were wholly inadequate given the volume of work being produced:

"Professionally our methods of transmitting and reviewing the results of research are generations old and by now are totally inadequate for their purpose. Those who conscientiously attempt to keep abreast of current thought, even in restricted fields, by close and continuous reading might well shy away from an examination calculated to show how much of the previous month's efforts could be produced on call."

From this line of thought, Bush then developed his vision of the Memex, a tool that would allow its user to note, bookmark, and otherwise organize information in whatever fashion made most sense to that user.

Visionary in its scope, the Memex continues to give motivation and direction to a large amount of research in information storage and retrieval. It is abundantly clear that today the Memex might help immeasurably in stemming the increasing tide of information. While mechanisms to generate information continue to grow in number and sophistication, tools and techniques to manage, filter, and search lag behind. The level of care taken in the preparation of information for online publication varies greatly. Access to the information is often poor. Even awareness of the existence of specific data is becoming increasingly difficult. Organizational strategies provided by information publishers are publisher-centric or designed to meet the needs of a specific user group.

What is needed are tools that will enable users to create personal collections of information resources of interest to them. It will be necessary to cull tens of thousands of resources for those of specific interest; it will also be necessary to continuously monitor available resources to detect new useful sources or to decide that others are no longer of interest. Efficient search strategies are required to support the discovery of resources and to search and fuse information gleaned from those resources.

In this paper we present a vision of a user-centered, user-organized information space called a Personalized Information Environment, or PIE. In contrast to a typical Internet search of multiple information resources, where control of which resources are searched is in the search engine's hands, a PIE places the control in the user's hands. In the PIE formulation, descriptions of resources are made available to users, and they decide which resources to include in a search. The process of resource selection is highly interactive and might involve sample searches and then selection or de-selection of resources from the user's current personalized collection. Regardless of the degree of interactivity, efficient and effective search is provided within whatever context the current collection of resources defines. Since users may spend considerable effort customizing their personal resource collection, it makes sense to allow sharing of the collection in constrained ways or using pre-defined policies, while maintaining whatever privacy or security constraints might be placed on particular resources or users. Thus there are four driving principles behind the PIE.
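
The resource-selection workflow described above (browse resource descriptions, select or de-select resources, then search within the current collection) can be illustrated with a minimal Python sketch. It is not the PIE implementation: the class names `Resource` and `PersonalCollection`, their methods, and the naive substring matching are all assumptions made for illustration, and the sample data is invented.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Resource:
    """Hypothetical stand-in for one information resource in the palette."""
    name: str
    description: str          # what the user reads when deciding to select it
    documents: List[str]

    def search(self, query: str) -> List[str]:
        # Naive case-insensitive substring match (an assumption);
        # a real resource would run its own retrieval engine.
        q = query.lower()
        return [doc for doc in self.documents if q in doc.lower()]


@dataclass
class PersonalCollection:
    """The user's current personalized collection (hypothetical)."""
    selected: Dict[str, Resource] = field(default_factory=dict)

    def select(self, resource: Resource) -> None:
        self.selected[resource.name] = resource

    def deselect(self, name: str) -> None:
        self.selected.pop(name, None)

    def search(self, query: str) -> List[Tuple[str, str]]:
        # Search only the resources the user has chosen, in selection order.
        hits: List[Tuple[str, str]] = []
        for name, resource in self.selected.items():
            hits.extend((name, doc) for doc in resource.search(query))
        return hits


# Invented sample palette, for illustration only.
palette = [
    Resource("tech-reports", "Computer science technical reports",
             ["Distributed retrieval survey", "Legion overview"]),
    Resource("newswire", "General news stories",
             ["Retrieval of lost luggage"]),
]

collection = PersonalCollection()
for resource in palette:
    collection.select(resource)

sample_hits = collection.search("retrieval")    # sample search over both resources
collection.deselect("newswire")                 # user drops an unhelpful resource
refined_hits = collection.search("retrieval")   # search within the refined collection
```

In this sketch the search context is simply whatever the user has selected at the moment of the query, which mirrors the idea that control over which resources are searched rests with the user rather than with the search engine. Result fusion and the exchange of collection statistics are deliberately left out.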

[1]  Luis Gravano,et al.  STARTS: Stanford proposal for Internet meta-searching , 1997, SIGMOD '97.

[2]  Kevin Chen-Chuan Chang,et al.  Using Distributed Objects to Build the Stanford Digital Library Infobus , 1999, Computer.

[3]  Ellen M. Voorhees,et al.  Learning collection fusion strategies , 1995, SIGIR '95.

[4]  Luis Gravano,et al.  dSCAM: finding document copies across multiple databases , 1996, Fourth International Conference on Parallel and Distributed Information Systems.

[5]  Bruce R. Schatz,et al.  Building the interspace: the Illinois Digital Library Project , 1995, CACM.

[6]  James C. French,et al.  Ensuring Retrieval Effectiveness in Distributed Digital Libraries , 1996, J. Vis. Commun. Image Represent.

[7]  Luis Gravano,et al.  The Effectiveness of GlOSS for the Text Database Discovery Problem , 1994, SIGMOD Conference.

[8]  James C. French,et al.  Comparing the performance of database selection algorithms , 1999, SIGIR '99.

[9]  James P. Callan,et al.  Automatic discovery of language models for text databases , 1999, SIGMOD '99.

[10]  Ronald L. Larsen.  Relaxing Assumptions... Stretching the Vision: A Modest View of Some Technical Issues , 1997, D-Lib Mag.

[11]  James C. French,et al.  Dissemination of collection wide information in a distributed information retrieval system , 1995, SIGIR '95.

[12]  Andrew S. Grimshaw,et al.  The Legion vision of a worldwide virtual computer , 1997, Commun. ACM.

[13]  Ellen M. Voorhees.  The TREC-5 Database Merging Track , 1996, TREC.

[14]  Andreas Paepcke,et al.  Using Distributed Objects for Digital Library Interoperability , 1996, Computer.

[15]  James C. French,et al.  Evaluating database selection techniques: a testbed and experiment , 1998, SIGIR '98.

[16]  William A. Wulf,et al.  A new model of security for distributed systems , 1996, NSPW '96.

[17]  Luis Gravano,et al.  GlOSS: text-source discovery over the Internet , 1999, TODS.

[18]  Bert J. Dempsey,et al.  An interactive WWW search engine for user-defined collections , 1998, DL '98.

[19]  Nicholas J. Belkin,et al.  Information filtering and information retrieval: two sides of the same coin? , 1992, CACM.

[20]  James C. French,et al.  On the update of term weights in dynamic information retrieval systems , 1995, CIKM '95.

[21]  James P. Callan,et al.  Document filtering with inference networks , 1996, SIGIR '96.

[22]  Vannevar Bush,et al.  As we may think , 1945, INTR.

[23]  J. C. French.  DIRE: an approach to improving informal scientific communication , 1994.

[24]  W. Bruce Croft,et al.  Searching distributed collections with inference networks , 1995, SIGIR '95.

[25]  James C. French,et al.  Efficient searching in distributed digital libraries , 1998, DL '98.