Methodologies for Trend Detection in Textual Data Mining

We present two methodologies for detecting emerging trends in textual data mining. These manual methods are intended to help us improve the performance of our existing fully automatic trend detection system [3]. The first methodology uses citation traces with pruning metrics to generate a document set for an emerging trend; threshold values are then tested to determine the year in which the trend emerges. The second methodology uses web resources to identify incipient emerging trends. We demonstrate at a 99% confidence level that our second approach significantly improves the precision of trend detection. Lastly, we propose integrating these methods both to improve our existing fully automatic approach and to deploy our semi-automated CIMEL [20] prototype, which employs emerging trend detection to enhance multimedia-based Computer Science education.

[1] W. Klein, et al. Bibliometrics, 2005, Social work in health care.

[2] Steven L. Salzberg, et al. On growing better decision trees from data, 1996.

[3] M. Shaw, et al. Induction of fuzzy decision trees, 1995.

[4] Pak Chung Wong, et al. Visualizing sequential patterns for text mining, 2000, IEEE Symposium on Information Visualization 2000. INFOVIS 2000. Proceedings.

[5] J. Ross Quinlan, et al. C4.5: Programs for Machine Learning, 1992.

[6] J. Ross Quinlan, et al. Induction of Decision Trees, 1986, Machine Learning.

[7] Edward A. Fox, et al. Visualizing search results: some alternatives to query-document similarity, 1996, SIGIR '96.

[8] Cezary Z. Janikow, et al. Exemplar learning in fuzzy decision trees, 1996, Proceedings of IEEE 5th International Fuzzy Systems.

[9] Qiang Wang, et al. Design and Evaluation of Multimedia to Teach Java and Object-Oriented Software Engineering, 2002.

[10] Lucy T. Nowell, et al. ThemeRiver*: In Search of Trends, Patterns, and Relationships, 1999.

[11] William M. Pottenger, et al. Detecting emerging concepts in textual data mining, 2001.

[12] Alberto Maria Segre, et al. Programs for Machine Learning, 1994.

[13] Yiming Yang, et al. Learning approaches for detecting and tracking news events, 1999, IEEE Intell. Syst.

[14] F. D. Bouskila. The Role of Semantic Locality in Hierarchical Distributed Dynamic Indexing and Information Retrieval, 1999.

[15] David Jensen, et al. TimeMines: Constructing Timelines with Statistical Models of Word Usage, 2000, KDD 2000.

[16] Robert L. Grossman, et al. Data Mining for Scientific and Engineering Applications, 2001, Massive Computing.

[17] Alan L. Porter, et al. Technology opportunities analysis, 1995.

[18] D. J. Price, et al. Networks of scientific papers, 1965, Science.