Paper information - Memex: A Browsing Assistant for Collaborative Archiving and Mining of Surf Trails

Memex: A Browsing Assistant for Collaborative Archiving and Mining of Surf Trails

Keyword indices, topic directories, and link-based rankings are used to search and structure the rapidly growing Web today. Surprisingly little use is made of years of browsing experience of millions of people. Indeed, this information is routinely discarded by browsers. Even deliberate bookmarks are stored in a passive and isolated manner. All this goes against Vannevar Bush’s dream of the Memex: an enhanced supplement to personal and community memory. We propose to demonstrate the beginnings of a ‘Memex’ for the Web: a browsing assistant for individuals and groups with focused interests. Memex blurs the artificial distinction between browsing history and deliberate bookmarks. The resulting glut of data is analyzed in a number of ways at the individual and community levels. Memex constructs a topic directory customized to the community, mapping their interests naturally to nodes in this directory. This lets the user recall topic-based browsing contexts by asking questions like “What trails was I following when I was last surfing about classical music?” and “What are some popular pages in or near my community’s recent trail graph related to music?”

1 Motivation

Three paradigms have emerged for exploring the Web: keyword search, directory browsing, and following links. Popular search engine and directory sites are visited tens of millions of times per day. We speculate that the total number of clicks per day is orders of magnitude larger. This third source of information, the browsing history of millions of Web users over several years, an information source that dwarfs the scale of the Web itself, is almost entirely discarded by browsers as ‘history’. Deliberate ‘bookmarks’ are preserved, but passively, in browser-dependent formats; this separates them from the dominant world of HTML hypermedia, even if their owners were willing to share them (as they are, in our experience, with all but a small section of their browsing activity).

Mitul Tiwari | Soumen Chakrabarti | Sandeep Srivastava | Mallela Subramanyam
