Mining user tasks from print logs

As applications proliferate on the World Wide Web, large volumes of user interaction data are collected and exploited to discover user behavior and interest patterns. In this paper, we exploit a new kind of interaction data, namely print logs, in which each record consists of the URLs a user selected for printing with a popular web printing tool. Users usually print web content with an intention (a subtask or task) in mind. Mining common print tasks from print logs can therefore capture users' intentions, which benefits web applications such as task-oriented recommendation and behavioral targeting. However, this is difficult because both URL topic representation and task formulation are challenging. To this end, we propose a general framework, named UPT (Users Print Tasks mining framework), for mining print tasks from print logs. Specifically, we leverage Delicious (a social bookmarking web service) as an external thesaurus to expand the representation of each URL by selecting tags associated with the URL's domain. We then construct a tag co-occurrence graph in which similar tags are clustered into subtasks. Viewing each subtask as an item, we transform the print log into a transaction database, on which we propose an efficient pattern mining algorithm to induce tasks. Finally, we evaluate the effectiveness of the proposed framework through experiments on a real print log.
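The pipeline described in the abstract can be illustrated with a minimal sketch. It is not the UPT implementation: the abstract does not specify the clustering method or the mining algorithm, so this sketch assumes that tags co-occur when they annotate the same domain, clusters them as connected components of the co-occurrence graph, and mines frequent subtask sets by brute-force enumeration. The names DOMAIN_TAGS, PRINT_LOG, cluster_tags, to_transactions, and frequent_itemsets, as well as all data, are hypothetical; DOMAIN_TAGS stands in for a Delicious lookup.

```python
from collections import Counter
from itertools import combinations
from urllib.parse import urlparse

# Hypothetical stand-in for a Delicious lookup: domain -> tags.
DOMAIN_TAGS = {
    "maps.example.com": ["travel", "maps"],
    "hotels.example.com": ["travel", "hotel"],
    "recipes.example.com": ["cooking", "recipes"],
    "food.example.com": ["cooking", "recipes"],
}

# Hypothetical print log: each record lists the URLs one user printed.
PRINT_LOG = [
    ["http://maps.example.com/route", "http://recipes.example.com/pie"],
    ["http://hotels.example.com/rooms", "http://food.example.com/soup"],
    ["http://maps.example.com/city"],
]


def cluster_tags(domain_tags, min_cooccurrence=1):
    """Map each tag to a subtask label (assumed: connected components)."""
    # Count how often two tags annotate the same domain.
    counts = Counter()
    for tags in domain_tags.values():
        for a, b in combinations(sorted(set(tags)), 2):
            counts[(a, b)] += 1

    # Union-find over tags; join tags whose edge weight passes the threshold.
    parent = {t: t for tags in domain_tags.values() for t in tags}

    def find(t):
        while parent[t] != t:
            parent[t] = parent[parent[t]]
            t = parent[t]
        return t

    for (a, b), n in counts.items():
        if n >= min_cooccurrence:
            parent[find(a)] = find(b)

    groups = {}
    for t in parent:
        groups.setdefault(find(t), set()).add(t)
    # Label each subtask by its sorted member tags for readability.
    return {t: "+".join(sorted(m)) for m in groups.values() for t in m}


def to_transactions(print_log, tag_to_subtask, domain_tags):
    """Turn each print record into the set of subtasks its URLs touch."""
    return [
        {
            tag_to_subtask[tag]
            for url in record
            for tag in domain_tags.get(urlparse(url).netloc, [])
        }
        for record in print_log
    ]


def frequent_itemsets(transactions, min_support):
    """Brute-force frequent itemsets; fine for tiny inputs only."""
    counts = Counter()
    for items in transactions:
        for size in range(1, len(items) + 1):
            for subset in combinations(sorted(items), size):
                counts[subset] += 1
    return {s: n for s, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    subtasks = cluster_tags(DOMAIN_TAGS)
    transactions = to_transactions(PRINT_LOG, subtasks, DOMAIN_TAGS)
    for itemset, support in frequent_itemsets(transactions, 2).items():
        print(itemset, support)
```

In this toy log, a subtask set that recurs across records plays the role of a candidate task. The brute-force enumeration grows exponentially with record size, so a real print log would need a dedicated frequent pattern mining algorithm in its place.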
