Domain Knowledge-Based Compaction

Domain knowledge about the problem at hand often leads to an effective solution. In this chapter, we discuss ways to make use of domain knowledge in generating abstractions. We consider binary classifiers, such as the support vector machine (SVM) and adaptive boosting (AdaBoost), to classify 10-class handwritten digit data. We carry out a statistical analysis of the data to derive domain-knowledge inferences, and we combine these with a human expert's domain knowledge to arrive at a decision tree of depth 4 that classifies the 10-class data accurately. In this process, we provide an overview of multiclass classification approaches, decision trees, SVM, and AdaBoost. We combine prototype selection with both methods to obtain high classification accuracy. In essence, the approach emphasizes the exploitation of domain knowledge in mining large datasets, which in the present case yields significant data compaction together with accurate multiclass classification. We conclude the chapter with a discussion of relevant literature and a list of references.
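To make the setting concrete, the following is a minimal sketch (not the chapter's exact pipeline) in Python with scikit-learn: it trains a multiclass SVM, which internally reduces the 10-class digit problem to pairwise binary classifiers, alongside a depth-limited decision tree analogous in spirit to the depth-4 tree described above. The dataset, library, and all parameter choices here are illustrative assumptions; the chapter's tree is built from domain knowledge over binary classifiers, not directly over raw pixels as done below.

```python
# Illustrative sketch only: compare a binary-classifier-based multiclass SVM
# (one-vs-one reduction, handled internally by SVC) with a shallow decision
# tree on 10-class handwritten digit data.
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_digits(return_X_y=True)  # 10-class handwritten digit data
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.3, random_state=0)

# SVC trains one binary classifier per pair of classes (one-vs-one).
svm = SVC(kernel="rbf", gamma="scale").fit(X_tr, y_tr)

# A depth-limited tree, loosely analogous to the chapter's depth-4 tree
# (the chapter's tree encodes domain knowledge rather than raw pixel splits).
tree = DecisionTreeClassifier(max_depth=4, random_state=0).fit(X_tr, y_tr)

print("SVM accuracy: ", accuracy_score(y_te, svm.predict(X_te)))
print("Tree accuracy:", accuracy_score(y_te, tree.predict(X_te)))
```

The gap between the two scores on raw pixels illustrates why a shallow tree needs domain knowledge to compete with kernel methods on this task.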
