Large-scale structural learning and predicting via hashing approximation

By combining structural information with the nonparallel support vector machine, the structural nonparallel support vector machine (SNPSVM) can fully exploit prior knowledge to directly improve the algorithm's generalization capacity. However, the scalability issue of how to train SNPSVM efficiently on very high-dimensional data has not been studied. In this paper, we integrate linear SNPSVM with the b-bit minwise hashing scheme to speed up the training phase for large-scale, high-dimensional statistical learning, and we then address the problem of speeding up its prediction phase via locality-sensitive hashing. For one-against-one multi-class classification problems, a two-stage strategy is put forward: in the first stage, a series of hash-based classifiers is built to approximate the exact results and filter the hypothesis space; in the second stage, the classification is refined by solving a multi-class SNPSVM on the remaining classes. The proposed method can handle large-scale classification problems with a huge number of features. Experimental results on two large-scale datasets (news20 and webspam) demonstrate the efficiency of structural learning via b-bit minwise hashing. Experimental results on the ImageNet-BOF dataset and several large-scale UCI datasets show that the proposed hash-based prediction can be more than two orders of magnitude faster than the exact classifier, with minor losses in quality.
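As an illustration of the hashing step mentioned above, the following is a minimal sketch of b-bit minwise hashing for a single sparse binary sample, followed by the usual expansion of the hash codes into a sparse vector that a linear classifier can consume. It is not the paper's implementation: the function names `bbit_minhash` and `expand_codes`, the parameters `k`, `b`, `seed`, and `prime`, and the use of universal hashing in place of true random permutations are all assumptions made for this example. The sample is assumed to be a non-empty set of feature indices smaller than `prime`.

```python
import numpy as np


def bbit_minhash(indices, k=64, b=2, seed=0, prime=2_147_483_647):
    """Return k b-bit minwise hash codes for a non-empty set of feature indices.

    Assumption: each "permutation" is simulated by a universal hash
    h_j(x) = (a_j * x + c_j) mod prime, a common practical substitute.
    """
    rng = np.random.default_rng(seed)
    a = rng.integers(1, prime, size=k, dtype=np.int64)
    c = rng.integers(0, prime, size=k, dtype=np.int64)
    idx = np.asarray(sorted(indices), dtype=np.int64)
    # hashed[j, i] is the value of index i under the j-th hash function.
    hashed = (a[:, None] * idx[None, :] + c[:, None]) % prime
    mins = hashed.min(axis=1)          # minwise hash: smallest hashed value
    return mins & ((1 << b) - 1)       # keep only the lowest b bits


def expand_codes(codes, b):
    """One-hot expand k b-bit codes into a vector of length k * 2**b."""
    k = len(codes)
    width = 1 << b
    out = np.zeros(k * width, dtype=np.float32)
    out[np.arange(k) * width + codes] = 1.0
    return out


if __name__ == "__main__":
    sample = {3, 17, 256, 1024}        # hypothetical nonzero feature indices
    codes = bbit_minhash(sample, k=16, b=2, seed=1)
    features = expand_codes(codes, b=2)
    print(codes.shape, features.shape, int(features.sum()))
```

Under these assumptions, each sample is reduced to a vector with exactly `k` nonzero entries out of `k * 2**b`, regardless of the original dimensionality. The inner product of two expanded vectors counts how many of the `k` codes match, which is the quantity a linear classifier trained on the hashed data works with.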
