Extending Information Retrieval by Adjusting Text Feature Vectors

Automatic detection of a text's scope has become crucial for information retrieval because of semantic problems, linguistic problems, and poorly expressive content, and this has increased the demand for simple, language-independent, scope-based strategies. In this paper, we extend each document's feature vector with influential words that make the document more expressive: we extract the essential words of related documents, then analyze the network formed by these words to detect those that share meaningful concepts with exactly the document in question. In other words, we analyze each document within a single topic: the topic of that document. We adapt measures from social network analysis so that they take the weights of the document's words into account. The influence of the new words can be applied to the document either by changing the weights of its vector or by inserting the words as metadata. As an example, we classified documents and compared the effectiveness of our Intelligent Information Retrieval (IIR) model.
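The pipeline described above can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not say which network or which measures are used, so the co-occurrence network, the relative term-frequency weights, the weighted degree centrality, and every name and parameter below (`extend_vector`, `top_k`, `alpha`) are assumptions chosen for illustration only.

```python
from collections import Counter
from itertools import combinations


def term_weights(tokens):
    """Relative term frequency, standing in for the document's vector weights."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: n / total for word, n in counts.items()}


def cooccurrence_network(related_docs):
    """Undirected word network (assumed): an edge's weight is the number of
    related documents in which both words occur."""
    edges = Counter()
    for tokens in related_docs:
        for a, b in combinations(sorted(set(tokens)), 2):
            edges[(a, b)] += 1
    return edges


def weighted_centrality(edges, doc_weights):
    """Degree centrality in which each edge counts in proportion to the
    target document's weight for the word at the other end of the edge."""
    scores = Counter()
    for (a, b), count in edges.items():
        scores[a] += count * doc_weights.get(b, 0.0)
        scores[b] += count * doc_weights.get(a, 0.0)
    return scores


def extend_vector(doc_tokens, related_docs, top_k=2, alpha=0.5):
    """Return the extended vector and the list of newly added words."""
    weights = term_weights(doc_tokens)
    scores = weighted_centrality(cooccurrence_network(related_docs), weights)
    # Keep only words that are new to the document and tied to its own words.
    candidates = {w: s for w, s in scores.items() if w not in weights and s > 0}
    best = sorted(candidates.items(), key=lambda kv: (-kv[1], kv[0]))[:top_k]
    extended = dict(weights)
    if best:
        top_score = best[0][1]
        ceiling = alpha * max(weights.values())  # new words stay below existing ones
        for word, score in best:
            extended[word] = ceiling * score / top_score
    return extended, [word for word, _ in best]


doc = ["retrieval", "query", "index"]
related = [
    ["retrieval", "query", "ranking", "index"],
    ["retrieval", "ranking", "relevance"],
    ["football", "league", "ranking"],
]
extended, added = extend_vector(doc, related)
```

In this toy input, "football" and "league" appear in a related document but have no edge to any word of the target document, so they receive a score of zero and are left out. The `added` list corresponds to the second option mentioned above: instead of changing the vector weights, the same words could be stored as document metadata.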
