Navigation through Citation Network Based on Content Similarity Using Cosine Similarity Algorithm

The volume of scientific literature has grown rapidly over the past few decades; new topics and information are added in the form of articles, papers, text documents, web logs, and patents. This rapid growth has added a tremendous amount to current and past knowledge. In the process, new topics have emerged, some topics have split into sub-topics, and others have merged into a single topic. Manually searching for and selecting a topic in such a large body of information is expensive and workforce-intensive. To meet the emerging need for an automatic process that locates, organizes, connects, and associates these sources, researchers have proposed different techniques that automatically extract components of information presented in various formats and organize or structure them. The data targeted for component extraction may be text, video, or audio. Various algorithms structure this information, group similar items into clusters, and weight them by importance. The organized, structured, and weighted data is then compared with other structures, using various algorithms, to find similarity. Semantic patterns can be found by employing visualization techniques that show the similarity or relation between topics over time or in relation to a specific event. In this paper, we propose a model for citation networks based on the Cosine Similarity Algorithm, which answers questions such as how to connect documents using citations and content similarity, and how to visualize and navigate through the documents.
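As an illustration of the general idea (not the authors' implementation), the sketch below computes cosine similarity between simple term-frequency vectors and uses it to weight the edges of a small citation network. The toy documents, the `citations` list, the whitespace tokenizer, and the function names `term_vector` and `cosine_similarity` are all assumptions made for this example; the paper's actual preprocessing and weighting scheme are not given here.

```python
import math
from collections import Counter


def term_vector(text):
    """Lowercase the text, split on whitespace, and count term frequencies."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-frequency vectors."""
    # Only terms present in both vectors contribute to the dot product.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty document is treated as similar to nothing
    return dot / (norm_a * norm_b)


# Hypothetical toy corpus: document id -> text (assumed for illustration).
docs = {
    "A": "topic detection in citation networks",
    "B": "citation networks and topic evolution",
    "C": "signature verification techniques",
}

# Hypothetical citation links: (citing document, cited document).
citations = [("A", "B"), ("A", "C")]

vectors = {doc_id: term_vector(text) for doc_id, text in docs.items()}

# Weight each citation edge by the content similarity of its endpoints.
weighted_edges = {
    (src, dst): cosine_similarity(vectors[src], vectors[dst])
    for src, dst in citations
}

for (src, dst), weight in weighted_edges.items():
    print(f"{src} -> {dst}: {weight:.2f}")
```

In this sketch a citation between documents with overlapping vocabulary receives a higher weight than one between unrelated documents, which is one plausible way to rank links when navigating the network. A real system would likely use TF-IDF weighting and proper tokenization rather than raw whitespace-split counts.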
