Deep Bayesian network architecture for Big Data mining

Classical data mining methods face several challenges in the era of Big Data. Caught between the need for fast knowledge extraction and the high volumes of data acquired in short time windows, these methods have become inadequate. The variability and the veracity of Big Data complicate the machine learning process. The high volume of Big Data congests learning because classical methods are designed for small sets of features. Deep Learning has recently emerged with the aim of handling voluminous data. The notion of depth refers to converting the features into a new, more abstract representation in order to optimize an objective. Although Deep Learning methods are experimentally promising, their parameterization is exhaustive and empirical. To tackle these problems, we exploit the causality and the uncertainty modeling of the Bayesian Network to propose a new Deep Bayesian Network architecture. We provide a new learning algorithm for this multi-layered Bayesian Network with latent variables. We evaluate the proposed architecture and learning algorithm on benchmark datasets. We use high-dimensional data to simulate the Big Data challenges imposed by the volume and veracity aspects. We demonstrate the effectiveness of our contribution under these constraints.
