Paper Information - Learning Random Fourier Features by Hybrid Constrained Optimization

The kernel embedding algorithm is an important component for adapting kernel methods to large datasets. Because computing the embedding accounts for a major share of the computational cost in the testing phase, we propose a novel teacher-learner framework for learning computation-efficient kernel embeddings from specific data. In this framework, high-precision embeddings (the teacher) transfer data information to computation-efficient kernel embeddings (the learner). We jointly select informative embedding functions and pursue an orthogonal transformation between the two embeddings. We propose a novel constrained variational expectation maximization (CVEM) approach, in which the alternating direction method of multipliers (ADMM) is applied over a nonconvex domain in the maximization step. We also propose two specific formulations based on the prevalent Random Fourier Feature (RFF), the masked and blocked versions of Computation-Efficient RFF (CERF), obtained by imposing a random binary mask or a block structure on the transformation matrix. In empirical studies of several applications on different real-world datasets, we demonstrate that CERF significantly improves the performance of kernel methods over RFF under given constraints on the number of arithmetic operations, and that it is suitable for structured matrix multiplication in Fastfood-type algorithms.

Jun Zhu | Jianqiao Wangni | Jingwei Zhuo
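
For readers unfamiliar with the RFF baseline named in the abstract, the following is a minimal sketch of standard random Fourier features for the Gaussian kernel, followed by a random binary mask applied to the transformation matrix. It is not the paper's CERF or CVEM procedure: the names `rff_features`, `gaussian_kernel`, `keep`, and all parameter values are assumptions chosen for illustration, and the mask here is applied without any learning step, so it only shows how masking reduces the number of nonzero multiplications.

```python
import numpy as np

def rff_features(X, W, b):
    """Map rows of X to features z(x) = sqrt(2/D) * cos(W x + b)."""
    D = W.shape[0]
    return np.sqrt(2.0 / D) * np.cos(X @ W.T + b)

def gaussian_kernel(X, Y, sigma):
    """Exact Gaussian kernel exp(-||x - y||^2 / (2 sigma^2))."""
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-sq / (2.0 * sigma**2))

rng = np.random.default_rng(0)
n, d, D, sigma = 50, 8, 4096, 1.5   # illustrative sizes, not from the paper

X = rng.normal(size=(n, d))
# Frequencies drawn from the Gaussian kernel's spectral density N(0, I / sigma^2).
W = rng.normal(scale=1.0 / sigma, size=(D, d))
b = rng.uniform(0.0, 2.0 * np.pi, size=D)

Z = rff_features(X, W, b)
K_exact = gaussian_kernel(X, X, sigma)
K_rff = Z @ Z.T                      # inner products approximate the kernel
max_err = float(np.max(np.abs(K_exact - K_rff)))

# Hypothetical masked transformation: zero out a random subset of entries.
# Without the learning step described in the abstract, this changes which
# kernel is approximated; it only illustrates the reduced operation count.
keep = 0.5
mask = rng.random(size=W.shape) < keep
W_masked = W * mask
Z_masked = rff_features(X, W_masked, b)
nnz_fraction = float(mask.mean())
```

Comparing `K_rff` against `K_exact` shows how closely the plain RFF map approximates the kernel at a given `D`, while `nnz_fraction` indicates the share of multiplications a sparse implementation of `W_masked` would retain.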
