An overview and perspectives on bidirectional intelligence: Lmser duality, double IA harmony, and causal computation

Advances in bidirectional intelligence are reviewed along three threads, with extensions and new perspectives. The first thread concerns bidirectional learning architecture, exploring five dualities that enable the six cognitive functions of Lmser and provide new perspectives from which many extensions, particularly flexible Lmser, are proposed. Interestingly, one or two of these dualities actually play an important role in recent models such as U-Net, ResNet, and DenseNet. The second thread concerns bidirectional learning principles unified by best yIng-yAng (IA) harmony in the BYY system. After gaining insights into deep bidirectional learning from a bird's-eye view of typical existing learning principles from one or both of the inward and outward directions, maximum likelihood, the variational principle, and several other learning principles are summarized as exemplars of BYY learning, with new perspectives on advanced topics. The third thread proceeds further to deep bidirectional intelligence, driven by long term dynamics (LTD) for parameter learning and short term dynamics (STD) for image thinking and rational thinking in harmony. Image thinking deals with the information flow of continuously valued arrays, especially image sequences, as if thinking were displayed in the real world; it is exemplified by the flow from inward encoding/cognition to outward reconstruction/transformation performed in Lmser learning and BYY learning. In contrast, rational thinking handles symbolic strings or discretely valued vectors, performing uncertainty reasoning and problem solving. In particular, a general thesis is proposed for bidirectional intelligence, featuring the BYY intelligence potential theory (BYY-IPT) and nine essential dualities in architecture, fundamentals, and implementation, respectively. Then, problems of combinatorial solving and uncertainty reasoning are investigated from this BYY-IPT perspective.
First, variants and extensions are suggested for AlphaGoZero-like searching tasks, such as the traveling salesman problem (TSP) and attributed graph matching (AGM), which are turned into Go-like problems with the help of a feature enrichment technique. Second, reasoning activities are summarized under the guidance of BYY-IPT from the aspects of constraint satisfaction, uncertainty propagation, and path or tree searching. In particular, a causal potential theory is proposed for discovering causal direction, with two roads developed for its implementation.
