Data Mining: Classification and Prediction

Data mining is the set of computational techniques and methodologies for extracting knowledge from large amounts of data, using sophisticated data analysis tools to reveal the information structure underlying large data sets. Machine learning methods are one such tool, supporting not only data management but also analysis and prediction. Supervised learning, one kind of machine learning methodology, takes input data and produces outputs of two types, qualitative and quantitative, which respectively describe data classes and predict data trends. The classification task provides qualitative responses, whereas the prediction, or regression, task yields quantitative outputs.
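The distinction between the two output types can be illustrated with a minimal sketch. The data, the function names, and the choice of a nearest-neighbour rule and a least-squares line are all assumptions made for illustration only; they are not methods taken from this text.

```python
# Toy data invented for illustration; not taken from any real data set.
labelled = [(1.0, "low"), (2.0, "low"), (8.0, "high"), (9.0, "high")]
numeric = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]


def nearest_neighbor_class(train, x):
    """Classification: return the class label of the closest training input."""
    return min(train, key=lambda pair: abs(pair[0] - x))[1]


def fit_line(points):
    """Regression: ordinary least-squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# Qualitative output: a class label.
predicted_class = nearest_neighbor_class(labelled, 7.5)

# Quantitative output: a numeric value on a fitted trend.
slope, intercept = fit_line(numeric)
predicted_value = slope * 4.0 + intercept

print(predicted_class, predicted_value)
```

Both functions learn from labelled examples, but the first returns one of a fixed set of class labels while the second returns a number on a continuous scale.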
