An examination of on-line machine learning approaches for pseudo-random generated data

A pseudo-random generator is an algorithm that produces a sequence of objects which is fully determined by a truly random seed but is not itself truly random. Such generators are widely used in applications such as cryptography and simulation. In this article, we examine currently popular machine learning algorithms, combined with various on-line algorithms, on pseudo-random generated data in order to determine which machine learning approach is best suited to on-line prediction for this kind of data. To further improve prediction performance, we propose a novel sample weighted algorithm that takes the generalization error in each iteration into account. We perform an intensive evaluation on real Baccarat data generated by casino machines and on random numbers generated by a popular Java program, two typical examples of pseudo-random generated data. The experimental results show that the support vector machine and k-nearest neighbors perform better than the other algorithms on the evaluation data set, both with and without the sample weighted algorithm.
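To make the on-line prediction setting concrete, the following is a minimal sketch, not the evaluation code of this article. It assumes a 48-bit linear congruential generator with the constants documented for `java.util.Random` (the abstract does not name the Java program used), a hypothetical sliding-window feature encoding, and a plain k-nearest-neighbors vote with Hamming distance. Each outcome is first predicted and only then added to the learner's memory (predict-then-update). The proposed sample weighted algorithm is not reproduced here, since the abstract does not specify it.

```python
from collections import Counter, deque


class JavaStyleLCG:
    """48-bit LCG using the constants documented for java.util.Random.

    Assumption: the abstract only mentions "a popular Java program";
    this class is an illustrative stand-in, not the authors' generator.
    """
    MULT = 0x5DEECE66D
    ADD = 0xB
    MASK = (1 << 48) - 1

    def __init__(self, seed):
        # Same seed scrambling step as documented for java.util.Random.
        self.state = (seed ^ self.MULT) & self.MASK

    def next_bit(self):
        # Advance the state and return its top bit (0 or 1).
        self.state = (self.state * self.MULT + self.ADD) & self.MASK
        return self.state >> 47


def knn_predict(memory, window, k):
    """Majority vote among the k stored windows nearest in Hamming distance."""
    if not memory:
        return 0  # arbitrary default before any example has been seen
    nearest = sorted(
        memory,
        key=lambda item: sum(a != b for a, b in zip(item[0], window)),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


def prequential_accuracy(stream, window_size=8, k=5):
    """Predict each outcome from the previous window, then learn from it."""
    memory = []                       # (window, next outcome) pairs seen so far
    window = deque(maxlen=window_size)
    correct = total = 0
    for outcome in stream:
        if len(window) == window_size:
            features = tuple(window)
            prediction = knn_predict(memory, features, k)
            correct += prediction == outcome
            total += 1
            memory.append((features, outcome))  # update only after predicting
        window.append(outcome)
    return correct / total if total else 0.0


if __name__ == "__main__":
    gen = JavaStyleLCG(seed=42)       # hypothetical seed
    bits = [gen.next_bit() for _ in range(500)]
    print("prequential accuracy:", prequential_accuracy(bits))
```

The window size, `k`, and the seed are illustrative choices. Because every prediction is made before the true outcome is revealed to the learner, the running accuracy is an on-line estimate rather than a held-out test score.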
