A Novel Regularization Learning for Single-View Patterns: Multi-View Discriminative Regularization

Multi-View Learning (MVL) studies how to learn from patterns described by multiple information sources, and it has been proven to generalize better than the usual Single-View Learning (SVL). In most real-world cases, however, only single-source patterns are available, so existing MVL methods cannot be applied. This paper develops a new multi-view regularization learning method for single-source patterns. Concretely, given single-source patterns, we first map them into M feature spaces using M different empirical kernels, then associate each generated feature space with our previously proposed Discriminative Regularization (DR), and finally combine the M DRs into a single learning process to obtain a new Multi-View Discriminative Regularization (MVDR), in which each DR serves as one view. The proposed method achieves: (1) complementarity among the multiple views generated from single-source patterns; (2) an analytic solution for classification; (3) a direct optimization formulation for multi-class problems that needs no one-against-all or one-against-one strategy.
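
As a minimal sketch of the first step only (generating M views from single-source patterns), the Python code below assumes RBF kernels with hypothetical widths `gammas` and uses the standard eigendecomposition-based empirical kernel map. The function names and data are illustrative assumptions, and the DR terms and their joint optimization are not reproduced here.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    # Pairwise RBF kernel values exp(-gamma * ||a - b||^2).
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def empirical_kernel_map(X_train, gamma, tol=1e-10):
    # Eigendecompose the training kernel matrix K = Q diag(lam) Q^T.
    K = rbf_kernel(X_train, X_train, gamma)
    eigvals, eigvecs = np.linalg.eigh(K)
    # Keep only numerically positive eigenvalues.
    keep = eigvals > tol * eigvals.max()
    lam, Q = eigvals[keep], eigvecs[:, keep]

    def transform(X):
        # Map x to diag(lam)^(-1/2) Q^T [k(x, x_1), ..., k(x, x_n)]^T.
        return rbf_kernel(X, X_train, gamma) @ Q / np.sqrt(lam)

    return transform

# Illustrative data: 20 single-source patterns with 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 3))

# Assumed kernel widths; each one yields one view (M = 3 here).
gammas = [0.1, 1.0, 10.0]
views = [empirical_kernel_map(X, g)(X) for g in gammas]
```

Each entry of `views` is an explicit feature representation whose inner products reproduce the corresponding kernel matrix on the training patterns, so a regularized learner can be attached to every view separately before the views are combined.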
