PubSearch: a Web citation‐based retrieval system

Many scientific publications are now available on the World Wide Web, where researchers share their findings. However, these publications tend to be poorly organised, which makes searching for relevant ones difficult and time‐consuming. Most existing search engines are ineffective for this task because they do not index the PDF (Portable Document Format) or PostScript files in which Web publications normally appear. This paper proposes a Web citation‐based retrieval system, known as PubSearch, for the retrieval of Web publications. PubSearch indexes Web publications using citation indices and stores them in a Web Citation Database. The Web Citation Database is then mined to support publication retrieval. Apart from supporting the traditional cited reference search, PubSearch also provides document clustering search and author clustering search. Document clustering groups related publications into clusters, while author clustering categorises authors into different research areas based on author co‐citation analysis.
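
As a rough illustration of the idea behind author co‐citation analysis, the sketch below counts how often two authors are cited together by the same paper. It is not PubSearch's implementation: the function name `cocitation_counts`, the input layout (a mapping from citing paper to the set of authors it cites), and the toy data are all assumptions made for this example. In a full analysis, counts of this kind would typically feed a clustering step that groups authors into research areas.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(citing_papers):
    """Count how often each pair of authors is cited by the same paper.

    citing_papers: dict mapping a citing-paper id to an iterable of the
    authors that paper cites (hypothetical layout, assumed for this sketch).
    Returns a Counter keyed by alphabetically ordered (author, author) pairs.
    """
    counts = Counter()
    for cited_authors in citing_papers.values():
        # Deduplicate and sort so each pair has a single canonical key.
        for pair in combinations(sorted(set(cited_authors)), 2):
            counts[pair] += 1
    return counts


# Hypothetical toy data: three citing papers and the authors they cite.
citing_papers = {
    "paper_1": {"author_a", "author_b", "author_c"},
    "paper_2": {"author_a", "author_b"},
    "paper_3": {"author_c", "author_d"},
}

counts = cocitation_counts(citing_papers)
for pair, n in sorted(counts.items()):
    print(pair, n)
```

Authors who are frequently co‐cited, such as `author_a` and `author_b` in the toy data, are the ones a clustering step would tend to place in the same research area.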
