Critical and future trends in data mining: a review of key data mining technologies/applications

Every day, enormous amounts of information are generated from all sectors, whether it be business, education, the scientific community, the World Wide Web (WWW) or one of many readily available off-line and online data sources. From all of this, which represents a sizable repository of data and information, it is possible to generate worthwhile and usable knowledge. As a result, the field of Data Mining (DM) and knowledge discovery in databases (KDD) has grown in leaps and bounds and has shown great potential for the future (Han & Kamber, 2001). The purpose of this chapter is to survey many of the critical and future trends in the field of DM, with a focus on those which are thought to have the most promise and applicability to future DM applications.

[1]  Rhonda Delmater,et al.  Data Mining Explained: A Manager's Guide to Customer-Centric Business Intelligence , 2001 .

[2]  Jiawei Han,et al.  Selective Materialization: An Efficient Method for Spatial Data Cube Construction , 1998, PAKDD.

[3]  Deborah L. McGuinness,et al.  Integrated Support for Data Archeology , 1993, Int. J. Cooperative Inf. Syst..

[4]  Jian Pei,et al.  DNA-miner: a system prototype for mining DNA sequences , 2001, SIGMOD '01.

[5]  Jiawei Han,et al.  Mining recurrent items in multimedia with progressive resolution refinement , 2000, Proceedings of 16th International Conference on Data Engineering (Cat. No.00CB37073).

[6]  Sourav S. Bhowmick,et al.  Research Issues in Web Data Mining , 1999, DaWaK.

[7]  Jian Pei,et al.  Can we push more constraints into frequent pattern mining? , 2000, KDD '00.

[8]  Paul Greenberg CRM at the Speed of Light, Third Edition: Essential Customer Strategies for the 21st Century , 2004 .

[9]  Jon Kleinberg,et al.  Authoritative sources in a hyperlinked environment , 1999, SODA '98.

[10]  Jian Pei,et al.  Mining frequent patterns without candidate generation , 2000, SIGMOD 2000.

[11]  Anthony K. H. Tung,et al.  Spatial clustering in the presence of obstacles , 2001, Proceedings 17th International Conference on Data Engineering.

[12]  Anthony K. H. Tung,et al.  Spatial clustering methods in data mining : A survey , 2001 .

[13]  Jiawei Han,et al.  Spatial Data Mining: Progress and Challenges , 1996, Workshop on Research Issues on Data Mining and Knowledge Discovery.

[14]  Paolo Merialdo,et al.  Semistructured and structured data in the Web: going back and forth , 1997, SGMD.

[15]  Oren Etzioni,et al.  A scalable comparison-shopping agent for the World-Wide Web , 1997, AGENTS '97.

[16]  Carl Gutwin,et al.  Domain-Specific Keyphrase Extraction , 1999, IJCAI.

[17]  James E. Pitkow,et al.  Characterizing Browsing Strategies in the World-Wide Web , 1995, Comput. Networks ISDN Syst..

[18]  Ke Wang,et al.  Mining Frequent Itemsets Using Support Constraints , 2000, VLDB.

[19]  Ke Wang,et al.  Pushing Support Constraints Into Association Rules Mining , 2003, IEEE Trans. Knowl. Data Eng..

[20]  Hendrik Blockeel,et al.  Web mining research: a survey , 2000, SKDD.

[21]  Jochen Dörre,et al.  Text mining: finding nuggets in mountains of textual data , 1999, KDD '99.

[22]  Chanathip Namprempre,et al.  HyPursuit: a hierarchical network search engine that exploits content-link hypertext clustering , 1996, HYPERTEXT '96.

[23]  Tom M. Mitchell,et al.  Learning to Extract Symbolic Knowledge from the World Wide Web , 1998, AAAI/IAAI.

[24]  Jon M. Kleinberg,et al.  Mining the Web's Link Structure , 1999, Computer.

[25]  Carey L. Williamson,et al.  Internet Web servers: workload characterization and performance implications , 1997, TNET.

[26]  Kyuseok Shim,et al.  Data mining and the Web: past, present and future , 1999, WIDM '99.

[27]  John McCarthy,et al.  Phenomenal data mining: from data to phenomena , 2000, SKDD.

[28]  Philip S. Yu,et al.  Data mining for path traversal patterns in a web environment , 1996, Proceedings of 16th International Conference on Distributed Computing Systems.

[29]  Jiawei Han,et al.  Geographic Data Mining and Knowledge Discovery , 2001 .

[30]  Gregory S. Tseytin,et al.  Phenomenal Data Mining and Link Analysis. , 1998 .

[31]  Thorsten Joachims,et al.  WebWatcher : A Learning Apprentice for the World Wide Web , 1995 .

[32]  Hongjun Lu,et al.  Beyond intratransaction association analysis: mining multidimensional intertransaction association rules , 2000, TOIS.

[33]  Oren Etzioni,et al.  The World-Wide Web: quagmire or gold mine? , 1996, CACM.

[34]  Anthony K. H. Tung,et al.  Constraint-based clustering in large databases , 2001, ICDT.

[35]  Sergey Brin,et al.  The Anatomy of a Large-Scale Hypertextual Web Search Engine , 1998, Comput. Networks.

[36]  Alberto O. Mendelzon,et al.  Querying the World Wide Web , 1996, Fourth International Conference on Parallel and Distributed Information Systems.

[37]  Maurice D. Mulvenna,et al.  Discovering Internet marketing intelligence through online analytical web usage mining , 1998, SGMD.

[38]  Ray R. Larson,et al.  Bibliometrics of the World Wide Web: An Exploratory Analysis of the Intellectual Structure of Cyberspace , 1996 .

[39]  M. Mizruchi,et al.  Techniques for Disaggregating Centrality Scores in Social Networks , 1986 .

[40]  Marti A. Hearst Untangling Text Data Mining , 1999, ACL.

[41]  Jaideep Srivastava,et al.  Web usage mining: discovery and applications of usage patterns from Web data , 2000, SKDD.

[42]  Laks V. S. Lakshmanan,et al.  Mining frequent itemsets with convertible constraints , 2001, Proceedings 17th International Conference on Data Engineering.

[43]  Laks V. S. Lakshmanan,et al.  Constraint-Based Multidimensional Data Mining , 1999, Computer.

[44]  Jiawei Han,et al.  Efficient mining of partial periodic patterns in time series database , 1999, Proceedings 15th International Conference on Data Engineering (Cat. No.99CB36337).

[45]  Laks V. S. Lakshmanan,et al.  Optimization of constrained frequent set queries with 2-variable constraints , 1999, SIGMOD '99.

[46]  James J. Thomas,et al.  Visualizing the non-visual: spatial analysis and interaction with information from text documents , 1995, Proceedings of Visualization 1995 Conference.

[47]  Jiawei Han,et al.  Efficient Rule-Based Attribute-Oriented Induction for Data Mining , 2000, Journal of Intelligent Information Systems.

[48]  Jiawei Han,et al.  Data Mining: Concepts and Techniques , 2000 .

[49]  Umeshwar Dayal,et al.  PrefixSpan: Mining Sequential Patterns by Prefix-Projected Growth , 2001, ICDE 2001.

[50]  Ellen M. Voorhees,et al.  Query expansion using lexical-semantic relations , 1994, SIGIR '94.

[51]  Jiawei Han,et al.  AIM: Approximate Intelligent Matching for Time Series Data , 2000, DaWaK.

[52]  Jiawei Han,et al.  GeoMiner: a system prototype for spatial data mining , 1997, SIGMOD '97.

[53]  Soumen Chakrabarti,et al.  Distributed Hypertext Resource Discovery Through Examples , 1999, VLDB.

[54]  Israel Ben-Shaul,et al.  Automatically Organizing Bookmarks per Contents , 1996, Comput. Networks.

[55]  Jiawei Han,et al.  Discovery of Spatial Association Rules in Geographic Information Databases , 1995, SSD.

[56]  Raymond J. Mooney,et al.  A Mutually Beneficial Integration of Data Mining and Information Extraction , 2000, AAAI/IAAI.

[57]  Michael J. Pazzani,et al.  A hybrid user model for news story classification , 1999 .

[58]  Umeshwar Dayal,et al.  FreeSpan: frequent pattern-projected sequential pattern mining , 2000, KDD '00.

[59]  Josiane Mothe,et al.  TetraFusion: information discovery on the Internet , 1999, IEEE Intell. Syst..

[60]  Earl Rennison,et al.  Galaxy of news: an approach to visualizing and understanding expansive news landscapes , 1994, UIST '94.

[61]  Anthony K. H. Tung,et al.  Fault-Tolerant Frequent Pattern Mining: Problems and Challenges , 2001, DMKD.

[62]  Soumen Chakrabarti,et al.  Data mining for hypertext: a tutorial survey , 2000, SKDD.

[63]  Anupam Joshi,et al.  Data mining "to go": ubiquitous KDD for mobile and distributed environments , 2001, KDD Tutorials.

[64]  Jennifer Widom,et al.  The TSIMMIS Project: Integration of Heterogeneous Information Sources , 1994, IPSJ.

[65]  Jiawei Han,et al.  Mining MultiMedia Data , 1999 .