Iteratively Refining SVMs Using Priors
2015 IEEE International Conference on Big Data (Big Data)

Research on scalable machine learning algorithms has gained considerable traction as data volumes have grown exponentially over the past decades. Many Big Data applications resort to relatively simple data modelling techniques because of the computational constraints associated with more complex models. Simple models, while very efficient to estimate, often fail to capture the finer details of complex datasets. In this manuscript, we explore the idea that complex large-scale classification can be made tractable through a process of iterative refinement. In such a process, we focus on the nonlinearities of the data only after first finding an approximate linear model. The knowledge captured by the linear model is then incorporated implicitly into the nonlinear model, allowing it to focus on the important parts of the data after a rough first estimation. This in turn reduces overall training time and allows for a richer model representation, ultimately leading to more predictive power.
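
The two-stage idea described above can be illustrated with a minimal sketch. It is not the authors' algorithm, which the abstract does not specify. It assumes one simple way of using a linear model as a prior: a Pegasos-style linear SVM is trained first, and its scores are then added as a fixed offset to the margins of a kernelised Pegasos-style learner. Points that the linear model already classifies with margin at least 1 therefore trigger no kernel updates, so the nonlinear stage concentrates on the rest. All function names, the RBF kernel, the toy data, and the parameter values (`lam`, `gamma`, `n_iter`) are hypothetical choices made for illustration.

```python
import math
import random


def train_linear_pegasos(X, y, lam=0.01, n_iter=2000, seed=0):
    """Pegasos-style stochastic sub-gradient linear SVM (no bias term)."""
    rng = random.Random(seed)
    w = [0.0] * len(X[0])
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(X))
        eta = 1.0 / (lam * t)
        margin = y[i] * sum(wj * xj for wj, xj in zip(w, X[i]))
        # Shrink w (regularisation), then step towards x_i if it violates the margin.
        w = [(1.0 - eta * lam) * wj for wj in w]
        if margin < 1.0:
            w = [wj + eta * y[i] * xj for wj, xj in zip(w, X[i])]
    return w


def rbf(a, b, gamma=1.0):
    """Gaussian (RBF) kernel between two points."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def kernel_score(x, X, y, alpha, scale, gamma=1.0):
    """Nonlinear correction: sum_j alpha_j * y_j * K(x_j, x) / scale."""
    total = sum(a * y[j] * rbf(X[j], x, gamma) for j, a in enumerate(alpha) if a)
    return total / scale


def refine_with_prior(X, y, prior, lam=0.01, n_iter=2000, gamma=1.0, seed=0):
    """Kernelised Pegasos-style updates with a fixed prior score per point.

    Assumption of this sketch: the prior enters only as an additive offset
    to the margin, so well-classified points never become support vectors
    unless the correction itself pushes them inside the margin.
    """
    rng = random.Random(seed)
    alpha = [0] * len(X)
    for t in range(1, n_iter + 1):
        i = rng.randrange(len(X))
        g = kernel_score(X[i], X, y, alpha, lam * t, gamma)
        if y[i] * (prior[i] + g) < 1.0:
            alpha[i] += 1
    return alpha


def decision(x, w, X, y, alpha, lam, n_iter, gamma=1.0):
    """Combined score: linear prior plus kernel correction."""
    linear = sum(wj * xj for wj, xj in zip(w, x))
    return linear + kernel_score(x, X, y, alpha, lam * n_iter, gamma)


# Toy data (hypothetical): labels follow the sign of the first coordinate,
# except inside a small disc around (1, 1), where the label is flipped.
data_rng = random.Random(1)
X = [[data_rng.uniform(-2, 2), data_rng.uniform(-2, 2)] for _ in range(60)]
y = [-1 if (a - 1) ** 2 + (b - 1) ** 2 < 0.5 else (1 if a > 0 else -1)
     for a, b in X]

w = train_linear_pegasos(X, y)                               # rough first estimate
prior = [sum(wj * xj for wj, xj in zip(w, x)) for x in X]    # linear scores
alpha = refine_with_prior(X, y, prior)                       # nonlinear refinement
n_support = sum(1 for a in alpha if a)                       # points the kernel stage used
```

In this sketch, `n_support` counts the training points that the kernel stage actually touched. Comparing it with a run that uses `prior = [0.0] * len(X)` gives a rough sense of how much work the linear stage saves on a given dataset.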
