Data Processing and Knowledge Discovery in Databases

With advanced computer technologies and their omnipresent use, data accumulates at a speed that outpaces the human capacity for data processing. To meet this growing challenge, the community of knowledge discovery from databases emerged not long ago. The key issue this community studies is, in layman's terms, how to use the data to our advantage. Otherwise, why collect so much of it in the first place? To make raw data useful, we need to represent it, process it, extract knowledge from it, and present and understand that knowledge for various applications. In the first chapter, we provide the computational model of our study and the representation of data, and introduce the field of knowledge discovery from databases, which evolved from many fields, such as classification and clustering in statistics, pattern recognition, neural networks, machine learning, databases, exploratory data analysis, on-line analytical processing, optimization, high-performance and parallel computing, knowledge modeling, and data visualization. Ever-advancing data processing technology and the increasing demand to take advantage of stored data pose a new challenge for data mining; one solution to this challenge is feature selection, the core of this study.
