Iterative Search for Similar Documents on Mobile Devices

This paper presents a new method for finding documents whose topics are similar to those of an existing document set. It is designed to help mobile device users search a peer-to-peer environment for documents whose topics are similar to those of the documents on the user's own device. The algorithms are designed for slow processors, limited memory, and low data traffic between devices, which makes them applicable in an environment of mobile devices such as phones or PDAs. The keyword-list-based topic comparison is enhanced with cascading: a series of document-search stages, each specialized on the documents not selected by the previous stages. The paper presents the architecture, the algorithms employed, and the experimental results.
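
The cascading idea can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not specify how keyword lists are built or how a match is scored, so the `Stage` class, the `min_hits` threshold, the `tokenize` helper, and the example keyword lists are all hypothetical. The sketch assumes that a stage selects a document when enough of its keywords occur in the text, and that only documents rejected by a stage are passed on to the next one.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Stage:
    """One cascade element (hypothetical structure, not from the paper)."""
    keywords: frozenset  # keyword list describing a topic
    min_hits: int        # assumed rule: select if at least this many keywords occur


def tokenize(text):
    """Return the set of lowercase alphabetic words in the text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def cascade_select(documents, stages):
    """Run documents through the stages in order.

    Each stage only sees documents that no earlier stage selected.
    Returns (selected, rejected): selected maps a document id to the
    index of the stage that accepted it; rejected is the set of ids
    that no stage accepted.
    """
    selected = {}
    remaining = dict(documents)
    for index, stage in enumerate(stages):
        still_remaining = {}
        for doc_id, text in remaining.items():
            hits = len(stage.keywords & tokenize(text))
            if hits >= stage.min_hits:
                selected[doc_id] = index
            else:
                still_remaining[doc_id] = text
        remaining = still_remaining
    return selected, set(remaining)


# Illustrative data only; the keyword lists are invented for this sketch.
documents = {
    "a": "Football match ends with a late goal",
    "b": "The striker scored twice in the league",
    "c": "Stock market prices fell on Monday",
}
stages = [
    Stage(frozenset({"football", "goal", "match"}), min_hits=2),
    Stage(frozenset({"striker", "league", "scored"}), min_hits=2),
]
selected, rejected = cascade_select(documents, stages)
```

Because each stage works on a shrinking set of documents and only needs set intersection over a short keyword list, a scheme of this kind keeps both computation and the amount of data exchanged between peers small, which matches the constraints named above.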
