Mining Research Topics Evolving Over Time Using a Diachronic Multi-source Approach

The acquisition of new scientific knowledge and the evolving needs of society regularly call research orientations into question. Means of retracing and visualizing these evolutions are therefore necessary. Existing research survey tools give only a single, fixed view of research activity, which does not support dynamic topic mining. The objective of this paper is thus to propose a new incremental approach for following the evolution of research themes and research groups within a given scientific discipline, in terms of emergence or decline. Such behaviors can be detected by various filtering methods; we choose, however, to exploit neural clustering methods in a multi-view context. This new approach takes into account the incremental and chronological aspects of information, opening the way to the detection of convergences and divergences of research themes and groups.
