Summarizing personal web browsing sessions

We describe a system, implemented as a browser extension, that lets users quickly collect, view, and share personal Web content. The system employs a novel interaction model in which the user specifies webpage extraction patterns by interactively selecting webpage elements; the system then applies these patterns to automatically collect similar content. We also present a technique for creating visual summaries of the collected information by combining user labeling with predefined layout templates. The summaries are interactive: depending on the behaviors encoded in their templates, they may respond to mouse events. Finally, the summaries can be saved or sent to others, so that the research can continue at another place or time. An informal evaluation shows that the approach works well for popular websites and that users can quickly learn the interaction model for collecting Web content.
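To make the idea of an extraction pattern concrete, the following is a minimal sketch of how a pattern might be derived from one selected element and then reused to collect similar content. It is not the system's implementation: it assumes that a pattern is simply the root-to-element path of (tag, class) pairs, and every name in it (`PathCollector`, `pattern_from_selection`, `apply_pattern`) is hypothetical.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they are skipped to keep the stack balanced.
VOID_TAGS = {"br", "img", "hr", "input", "meta", "link"}


class PathCollector(HTMLParser):
    """Records, for every element, its root-to-element path and its own text.

    A path is a tuple of (tag, class) steps. This is an assumed, simplified
    stand-in for an extraction pattern, not the format used by the system.
    """

    def __init__(self):
        super().__init__()
        self.stack = []     # (tag, class) steps of the currently open elements
        self.records = []   # [path, text] for each element, in document order
        self.open_idx = []  # indices into records for the currently open elements

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        step = (tag, dict(attrs).get("class") or "")
        self.stack.append(step)
        self.records.append([tuple(self.stack), ""])
        self.open_idx.append(len(self.records) - 1)

    def handle_endtag(self, tag):
        if self.stack and self.stack[-1][0] == tag:
            self.stack.pop()
            self.open_idx.pop()

    def handle_data(self, data):
        # Text is attributed to the innermost open element only.
        if self.open_idx:
            self.records[self.open_idx[-1]][1] += data


def collect(html):
    """Return (path, text) pairs for every element in the HTML string."""
    parser = PathCollector()
    parser.feed(html)
    parser.close()
    return [(path, text.strip()) for path, text in parser.records]


def pattern_from_selection(html, selected_text):
    """Derive a pattern from the first element whose text matches the selection."""
    for path, text in collect(html):
        if text == selected_text:
            return path
    raise ValueError("selection not found in page")


def apply_pattern(html, pattern):
    """Collect the text of every element whose path equals the pattern."""
    return [text for path, text in collect(html) if path == pattern]


# Hypothetical product listing; the user "selects" the first title.
page = (
    '<ul>'
    '<li class="item"><span class="title">Camera A</span>'
    '<span class="price">$199</span></li>'
    '<li class="item"><span class="title">Camera B</span>'
    '<span class="price">$249</span></li>'
    '</ul>'
)
title_pattern = pattern_from_selection(page, "Camera A")
titles = apply_pattern(page, title_pattern)
```

In this sketch, selecting one title is enough to collect both titles, because the two elements share the same path of tags and classes, while the prices are excluded by their different class. A practical system would likely need a more tolerant matching rule, since real pages vary their structure between similar items.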
