Topic comparison of remote documents using small communication traffic

This paper presents a new method for semantic search designed for mobile device environments. The proposed system helps users by searching for documents whose topics are similar to those of the documents stored on the user's own device. The search runs continuously in the background, and the user is notified when documents worth downloading are found. The methods proposed in this paper aim to solve this task while keeping communication traffic low, which makes them applicable in mobile device environments.
