On essential topics of BYY harmony learning: Current status, challenging issues, and gene analysis applications

As a supplement to [Xu L. Front. Electr. Electron. Eng. China, 2010, 5(3): 281–328], this paper outlines the current status of efforts on Bayesian Ying-Yang (BYY) harmony learning, together with applications to gene analysis. It begins with a bird’s-eye view given via the Gaussian mixture, in comparison with typical learning algorithms and model selection criteria; in particular, semi-supervised learning is covered simply by choosing a scalar parameter. Then, essential topics and challenging issues in BYY system design and BYY harmony learning are systematically outlined: a modern perspective on the Yin-Yang viewpoint is discussed, another Yang factorization is addressed, and the coordinations across and within Ying-Yang are summarized. The BYY system acts as a unified framework that accommodates unsupervised, supervised, and semi-supervised learning in one formulation, while best harmony learning provides novelty and strength in automatic model selection. The mathematical formulation of the harmony functional is also addressed as a unified scheme for measuring the proximity considered in a BYY system, and is adopted as the best choice among the alternatives. Moreover, efforts are made on a number of learning tasks, including a mode-switching factor analysis proposed as a semi-blind learning framework for several types of independent factor analysis, a hidden Markov model (HMM) gated temporal factor analysis suggested for modeling piecewise stationary temporal dependence, a two-level hierarchical Gaussian mixture extended to cover semi-supervised learning, and a manifold learning method modified to facilitate automatic model selection. Finally, these studies are applied to problems in gene analysis, such as genome-wide association, exome sequencing analysis, and gene transcriptional regulation.
