Draft: Deep Learning in Neural Networks: An Overview
[1] Amir F. Atiya,et al. New results on recurrent network training: unifying the algorithms and accelerating convergence , 2000, IEEE Trans. Neural Networks Learn. Syst..
[2] Kiyotoshi Matsuoka,et al. Noise injection into inputs in back-propagation learning , 1992, IEEE Trans. Syst. Man Cybern..
[3] Gerald Tesauro,et al. TD-Gammon, a Self-Teaching Backgammon Program, Achieves Master-Level Play , 1994, Neural Computation.
[4] R. Zemel. A minimum description length framework for unsupervised learning , 1994 .
[5] W. Vent,et al. Rechenberg, Ingo, Evolutionsstrategie — Optimierung technischer Systeme nach Prinzipien der biologischen Evolution. 170 S. mit 36 Abb. Frommann‐Holzboog‐Verlag. Stuttgart 1973. Broschiert , 1975 .
[6] Douglas B. Lenat,et al. Why AM and EURISKO Appear to Work , 1984, Artif. Intell..
[7] Fei-Fei Li,et al. Large-Scale Video Classification with Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[8] Marvin Minsky,et al. Steps toward Artificial Intelligence , 1961, Proceedings of the IRE.
[9] Peter M. Todd,et al. Designing Neural Networks using Genetic Algorithms , 1989, ICGA.
[10] Lawrence Davis,et al. Training Feedforward Neural Networks Using Genetic Algorithms , 1989, IJCAI.
[11] Jürgen Schmidhuber,et al. Learning Complex, Extended Sequences Using the Principle of History Compression , 1992, Neural Computation.
[12] Ivan Laptev,et al. Learning and Transferring Mid-level Image Representations Using Convolutional Neural Networks , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[13] Anne Condon,et al. On the undecidability of probabilistic planning and related stochastic optimization problems , 2003, Artif. Intell..
[14] Juha Karhunen,et al. Generalizations of principal component analysis, optimization problems, and neural networks , 1995, Neural Networks.
[15] Geoffrey E. Hinton,et al. Learning Population Codes by Minimizing Description Length , 1993, Neural Computation.
[16] H. Akaike,et al. Information Theory and an Extension of the Maximum Likelihood Principle , 1973 .
[17] Peter Tiňo,et al. Learning long-term dependencies is not as difficult with NARX recurrent neural networks , 1995 .
[18] P. Földiák,et al. Forming sparse representations by local anti-Hebbian learning , 1990, Biological Cybernetics.
[19] J. Baxter,et al. Direct gradient-based reinforcement learning , 2000, 2000 IEEE International Symposium on Circuits and Systems. Emerging Technologies for the 21st Century. Proceedings (IEEE Cat No.00CH36353).
[20] Henry S. Baird,et al. Document image defect models , 1995 .
[21] N. Logothetis,et al. Shape representation in the inferior temporal cortex of monkeys , 1995, Current Biology.
[22] Vivek S. Borkar,et al. Learning Algorithms for Markov Decision Processes with Average Cost , 2001, SIAM J. Control. Optim..
[23] Mark B. Ring. Learning Sequential Tasks by Incrementally Adding Higher Orders , 1992, NIPS.
[24] Lawrence D. Jackel,et al. Backpropagation Applied to Handwritten Zip Code Recognition , 1989, Neural Computation.
[25] Ralph Neuneier,et al. How to Train Neural Networks , 2012, Neural Networks: Tricks of the Trade.
[26] Michael C. Mozer,et al. Induction of Multiscale Temporal Structure , 1991, NIPS.
[27] Jürgen Schmidhuber,et al. A Clockwork RNN , 2014, ICML.
[28] Steffen Udluft,et al. Learning Long Term Dependencies with Recurrent Neural Networks , 2006, ICANN.
[29] Richard S. Sutton,et al. Neural networks for control , 1990 .
[30] Roberto Battiti,et al. Accelerated Backpropagation Learning: Two Optimization Methods , 1989, Complex Syst..
[31] Roberto Battiti,et al. First- and Second-Order Methods for Learning: Between Steepest Descent and Newton's Method , 1992, Neural Computation.
[32] Ray J. Solomonoff,et al. A Formal Theory of Inductive Inference. Part II , 1964, Inf. Control..
[33] F. Faggin,et al. Neural network hardware , 1992, [Proceedings 1992] IJCNN International Joint Conference on Neural Networks.
[34] Sukhan Lee,et al. A Gaussian potential function network with hierarchically self-organizing learning , 1991, Neural Networks.
[35] Tapani Raiko,et al. Deep Learning Made Easier by Linear Transformations in Perceptrons , 2012, AISTATS.
[36] Jürgen Schmidhuber,et al. Recurrent policy gradients , 2010, Log. J. IGPL.
[37] Nikola K. Kasabov,et al. NeuCube: A spiking neural network architecture for mapping, learning and understanding of spatio-temporal brain data , 2014, Neural Networks.
[38] Janet Wiles,et al. Recurrent Neural Networks Can Learn to Implement Symbol-Sensitive Counting , 1997, NIPS.
[39] David B. Fogel,et al. Evolving Neural Control Systems , 1995, IEEE Expert.
[40] Tom M. Mitchell,et al. Explanation-Based Generalization: A Unifying View , 1986, Machine Learning.
[41] Jeffrey L. Elman,et al. Finding Structure in Time , 1990, Cogn. Sci..
[42] Mark B. Ring. Incremental Development of Complex Behaviors , 1991, ML.
[43] Benjamin Schrauwen,et al. An overview of reservoir computing: theory, applications and implementations , 2007, ESANN.
[44] Andrew W. Moore,et al. Reinforcement Learning: A Survey , 1996, J. Artif. Intell. Res..
[45] W. Pitts,et al. A Logical Calculus of the Ideas Immanent in Nervous Activity (1943) , 2021, Ideas That Created the Future.
[46] Jürgen Schmidhuber,et al. LSTM recurrent networks learn simple context-free and context-sensitive languages , 2001, IEEE Trans. Neural Networks.
[47] Marcus Hutter. The Fastest and Shortest Algorithm for all Well-Defined Problems , 2002, Int. J. Found. Comput. Sci..
[48] Jürgen Schmidhuber,et al. HQ-Learning , 1997, Adapt. Behav..
[49] Kaspar Anton Schindler,et al. When pyramidal neurons lock, when they respond chaotically, and when they like to synchronize , 2000, Neuroscience Research.
[50] Raymond L. Watrous,et al. Induction of Finite-State Automata Using Second-Order Recurrent Networks , 1991, NIPS.
[51] Henry J. Kelley,et al. Gradient Theory of Optimal Flight Paths , 1960 .
[52] E. Blum,et al. The Mathematical Theory of Optimal Processes. , 1963 .
[53] Stephan K. Chalup,et al. Incremental training of first order recurrent neural networks to predict a context-sensitive language , 2003, Neural Networks.
[54] John N. Tsitsiklis,et al. Neuro-Dynamic Programming , 1996, Encyclopedia of Machine Learning.
[55] Isabelle Guyon,et al. Structural Risk Minimization for Character Recognition , 1991, NIPS.
[56] Mitsuo Kawato,et al. Multiple Model-Based Reinforcement Learning , 2002, Neural Computation.
[57] Jordan B. Pollack,et al. Implications of Recursive Distributed Representations , 1988, NIPS.
[58] A. N. Tikhonov,et al. Solutions of ill-posed problems , 1977 .
[59] Bernard Widrow,et al. Associative Storage and Retrieval of Digital Information in Networks of Adaptive “Neurons” , 1962 .
[60] Yann LeCun,et al. Optimal Brain Damage , 1989, NIPS.
[61] Jürgen Schmidhuber,et al. Continuous history compression , 1993 .
[62] Ali A. Minai,et al. Perturbation response in feedforward networks , 1994, Neural Networks.
[63] James J. DiCarlo,et al. How Does the Brain Solve Visual Object Recognition? , 2012, Neuron.
[64] Razvan Pascanu,et al. How to Construct Deep Recurrent Neural Networks , 2013, ICLR.
[65] John N. Tsitsiklis,et al. A survey of computational complexity results in systems and control , 2000, Autom..
[66] Yoram Singer,et al. Adaptive Subgradient Methods for Online Learning and Stochastic Optimization , 2011, J. Mach. Learn. Res..
[67] Michael S. Falconbridge,et al. A Simple Hebbian/Anti-Hebbian Network Learns the Sparse, Independent Components of Natural Images , 2006, Neural Computation.
[68] Jürgen Schmidhuber,et al. Discovering Predictable Classifications , 1993, Neural Computation.
[69] Giovanni Soda,et al. Exploiting the past and the future in protein secondary structure prediction , 1999, Bioinform..
[70] Pierre Baldi,et al. Understanding Dropout , 2013, NIPS.
[71] Henry Markram,et al. The human brain project. , 2012, Scientific American.
[72] King-Sun Fu,et al. Syntactic Pattern Recognition And Applications , 1968 .
[73] E. Rolls,et al. Neurodynamics of biased competition and cooperation for attention: a model with spiking neurons. , 2005, Journal of neurophysiology.
[74] Steven Douglas Whitehead,et al. Reinforcement learning for the adaptive control of perception and action , 1992 .
[75] John E. Moody,et al. The Effective Number of Parameters: An Analysis of Generalization and Regularization in Nonlinear Learning Systems , 1991, NIPS.
[76] Ming Yang,et al. Detecting Human Actions in Surveillance Videos , 2009, TRECVID.
[77] H. Seung,et al. Learning in Spiking Neural Networks by Reinforcement of Stochastic Synaptic Transmission , 2003, Neuron.
[78] Robert Desimone,et al. Parallel and Serial Neural Mechanisms for Visual Search in Macaque Area V4 , 2005, Science.
[79] R. Schapire. The Strength of Weak Learnability , 1990, Machine Learning.
[80] Ashwin Ram,et al. Experiments with Reinforcement Learning in Problems with Continuous State and Action Spaces , 1997, Adapt. Behav..
[81] Dr. Marcus Hutter,et al. Universal artificial intelligence , 2004 .
[82] Carolo Friederico Gauss. Theoria Motus Corporum Coelestium in Sectionibus Conicis Solem Ambientium , 2014 .
[83] David Zipser,et al. Feature Discovery by Competitive Learning , 1986, Cogn. Sci..
[84] Jürgen Schmidhuber,et al. The Speed Prior: A New Simplicity Measure Yielding Near-Optimal Computable Predictions , 2002, COLT.
[85] R. Hecht-Nielsen. Theory of the backpropagation neural network , 1989 .
[86] M. Graziano. The Intelligent Movement Machine: An Ethological Perspective on the Primate Motor System , 2008 .
[87] David Haussler,et al. What Size Net Gives Valid Generalization? , 1989, Neural Computation.
[88] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[89] Yoshua Bengio,et al. Greedy Layer-Wise Training of Deep Networks , 2006, NIPS.
[90] Jürgen Schmidhuber,et al. Solving Deep Memory POMDPs with Recurrent Policy Gradients , 2007, ICANN.
[91] Risto Miikkulainen,et al. Accelerated Neural Evolution through Cooperatively Coevolved Synapses , 2008, J. Mach. Learn. Res..
[92] R. Rohrer,et al. Automated Network Design-The Frequency-Domain Case , 1969 .
[93] Punit Shah. Toward a Neurobiology of Unrealistic Optimism , 2012, Front. Psychology.
[94] Shun-ichi Amari,et al. Statistical Theory of Learning Curves under Entropic Loss Criterion , 1993, Neural Computation.
[95] Jürgen Schmidhuber,et al. Learning to Forget: Continual Prediction with LSTM , 2000, Neural Computation.
[96] Peter Tiňo,et al. Architectural Bias in Recurrent Neural Networks: Fractal Analysis , 2002, Neural Computation.
[97] Sepp Hochreiter,et al. Untersuchungen zu dynamischen neuronalen Netzen , 1991 .
[98] Christopher Kermorvant,et al. The A2iA Arabic Handwritten Text Recognition System at the Open HaRT2013 Evaluation , 2014, 2014 11th IAPR International Workshop on Document Analysis Systems.
[99] Jürgen Schmidhuber,et al. Learning to generate sub-goals for action sequences , 1991 .
[100] Dan Ciresan,et al. Multi-Column Deep Neural Networks for offline handwritten Chinese character classification , 2013, 2015 International Joint Conference on Neural Networks (IJCNN).
[101] B. Speelpenning. Compiling Fast Partial Derivatives of Functions Given by Algorithms , 1980 .
[102] D. Shanno. Conditioning of Quasi-Newton Methods for Function Minimization , 1970 .
[103] Patrice Y. Simard,et al. Best practices for convolutional neural networks applied to visual document analysis , 2003, Seventh International Conference on Document Analysis and Recognition, 2003. Proceedings..
[104] Luca Maria Gambardella,et al. Mitosis Detection in Breast Cancer Histology Images with Deep Neural Networks , 2013, MICCAI.
[105] Michael I. Jordan. Serial Order: A Parallel Distributed Processing Approach , 1997 .
[106] John R. Koza,et al. Genetic programming - on the programming of computers by means of natural selection , 1993, Complex adaptive systems.
[107] Jürgen Schmidhuber,et al. Long Short-Term Memory , 1997, Neural Computation.
[108] G. V. Puskorius,et al. A signal processing framework based on dynamic neural networks with application to problems in adaptation, filtering, and classification , 1998, Proc. IEEE.
[109] M. Hestenes,et al. Methods of conjugate gradients for solving linear systems , 1952 .
[110] Jürgen Schmidhuber,et al. Planning simple trajectories using neural subgoal generators , 1993 .
[111] Tomaso Poggio,et al. Fast Readout of Object Identity from Macaque Inferior Temporal Cortex , 2005, Science.
[112] Shimon Whiteson,et al. Critical factors in the performance of hyperNEAT , 2013, GECCO '13.
[113] Keiji Tanaka,et al. Matching Categorical Object Representations in Inferior Temporal Cortex of Man and Monkey , 2008, Neuron.
[114] J. Stephen Judd,et al. Optimal stopping and effective machine complexity in learning , 1993, Proceedings of 1995 IEEE International Symposium on Information Theory.
[115] Leslie Pack Kaelbling,et al. Planning and Acting in Partially Observable Stochastic Domains , 1998, Artif. Intell..
[116] Rajat Raina,et al. Large-scale deep unsupervised learning using graphics processors , 2009, ICML '09.
[117] Honglak Lee,et al. Convolutional deep belief networks for scalable unsupervised learning of hierarchical representations , 2009, ICML '09.
[118] Esther Levin,et al. Accelerated Learning in Layered Neural Networks , 1988, Complex Syst..
[119] Yann LeCun,et al. Off-Road Obstacle Avoidance through End-to-End Learning , 2005, NIPS.
[120] Rich Caruana,et al. Multitask Learning , 1997, Machine Learning.
[121] Vladimir N. Vapnik,et al. The Nature of Statistical Learning Theory , 2000, Statistics for Engineering and Information Science.
[122] David J. Jilk,et al. Recurrent Processing during Object Recognition , 2011, Front. Psychol..
[123] Subhash C. Kak,et al. Data Mining Using Surface and Deep Agents Based on Neural Networks , 2010, AMCIS.
[124] Randall D. Beer,et al. Sequential Behavior and Learning in Evolved Dynamical Neural Networks , 1994, Adapt. Behav..
[125] Jürgen Schmidhuber,et al. Shifting Inductive Bias with Success-Story Algorithm, Adaptive Levin Search, and Incremental Self-Improvement , 1997, Machine Learning.
[126] Pasi Koikkalainen,et al. Self-organizing hierarchical feature maps , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[127] Chalapathy Neti,et al. Maximally fault tolerant neural networks , 1992, IEEE Trans. Neural Networks.
[128] Lucas C. Parra,et al. Non-linear Feature Extraction by Redundancy Reduction in an Unsupervised Stochastic Neural Network , 1997, Neural Networks.
[129] Wulfram Gerstner,et al. Stochastic variational learning in recurrent spiking networks , 2014, Front. Comput. Neurosci..
[130] Christof Koch,et al. Unsupervised Learning of Individuals and Categories from Images , 2008, Neural Computation.
[131] Jürgen Schmidhuber,et al. Optimal Ordered Problem Solver , 2002, Machine Learning.
[132] Martin A. Riedmiller,et al. Deep auto-encoder neural networks in reinforcement learning , 2010, The 2010 International Joint Conference on Neural Networks (IJCNN).
[133] Emil L. Post. Finite combinatory processes—formulation , 1936, Journal of Symbolic Logic.
[134] Jürgen Schmidhuber,et al. Fast Online Q(λ) , 1998, Machine Learning.
[135] Dennis Gabor,et al. Theory of communication , 1946 .
[136] Nicol N. Schraudolph,et al. Fast Curvature Matrix-Vector Products for Second-Order Gradient Descent , 2002, Neural Computation.
[137] Nicholas T. Carnevale,et al. Simulation of networks of spiking neurons: A review of tools and strategies , 2006, Journal of Computational Neuroscience.
[138] Petre Stoica,et al. Decentralized Control , 2018, The Control Systems Handbook.
[139] Geoffrey E. Hinton,et al. The Helmholtz Machine , 1995, Neural Computation.
[140] Jonas Karlsson,et al. Learning via task decomposition , 1993 .
[141] R. Desimone,et al. Stimulus-selective properties of inferior temporal neurons in the macaque , 1984, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[142] Christopher Kermorvant,et al. Dropout Improves Recurrent Neural Networks for Handwriting Recognition , 2013, 2014 14th International Conference on Frontiers in Handwriting Recognition.
[143] Lee A. Feldkamp,et al. Neurocontrol of nonlinear dynamical systems with Kalman filter trained recurrent networks , 1994, IEEE Trans. Neural Networks.
[144] Shimon Whiteson,et al. Evolutionary Computation for Reinforcement Learning , 2012, Reinforcement Learning.
[145] G. Orban,et al. Model circuit of spiking neurons generating directional selectivity in simple cells. , 1996, Journal of neurophysiology.
[146] J. A. Lozano,et al. Estimation of Distribution Algorithms: A New Tool for Evolutionary Computation , 2001 .
[147] J. P. Jones,et al. An evaluation of the two-dimensional Gabor filter model of simple receptive fields in cat striate cortex. , 1987, Journal of neurophysiology.
[148] Brendan J. Frey,et al. Adaptive dropout for training deep neural networks , 2013, NIPS.
[149] Andrew W. Moore,et al. The Parti-game Algorithm for Variable Resolution Reinforcement Learning in Multidimensional State-spaces , 1993, Machine Learning.
[150] Jürgen Schmidhuber,et al. Recurrent nets that time and count , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[151] Jürgen Schmidhuber,et al. Feature Extraction Through LOCOCODE , 1999, Neural Computation.
[152] Lin Wu,et al. Learning to play Go using recursive neural networks , 2008, Neural Networks.
[153] S. Yoshizawa,et al. An Active Pulse Transmission Line Simulating Nerve Axon , 1962, Proceedings of the IRE.
[154] Janet Wiles,et al. Context-free and context-sensitive dynamics in recurrent neural networks , 2000, Connect. Sci..
[155] Maria S. Kulikova,et al. Mitosis detection in breast cancer histological images An ICPR 2012 contest , 2013, Journal of pathology informatics.
[156] D. G. Albrecht,et al. Spatial frequency selectivity of cells in macaque visual cortex , 1982, Vision Research.
[157] R. Sutton,et al. A convergent O(n) algorithm for off-policy temporal-difference learning with linear function approximation , 2008, NIPS 2008.
[158] Gert Cauwenberghs,et al. Event-driven contrastive divergence for spiking neuromorphic systems , 2013, Front. Neurosci..
[159] Marc'Aurelio Ranzato,et al. Efficient Learning of Sparse Representations with an Energy-Based Model , 2006, NIPS.
[160] Luis A. Plana,et al. SpiNNaker: Mapping neural networks onto a massively-parallel chip multiprocessor , 2008, 2008 IEEE International Joint Conference on Neural Networks (IEEE World Congress on Computational Intelligence).
[161] Jürgen Schmidhuber,et al. A Fixed Size Storage O(n^3) Time Complexity Learning Algorithm for Fully Recurrent Continually Running Networks , 1992, Neural Computation.
[162] Zhaoping Li,et al. Understanding Retinal Color Coding from First Principles , 1992, Neural Computation.
[163] S. Grossberg. Some Networks That Can Learn, Remember, and Reproduce any Number of Complicated Space-Time Patterns, I , 1969 .
[164] Terrence J. Sejnowski,et al. Unsupervised Discrimination of Clustered Data via Optimization of Binary Information Gain , 1992, NIPS.
[165] Wulfram Gerstner,et al. Spiking Neuron Models , 2002 .
[166] Chia-Feng Juang,et al. A hybrid of genetic algorithm and particle swarm optimization for recurrent network design , 2004, IEEE Transactions on Systems, Man, and Cybernetics, Part B (Cybernetics).
[167] Martin A. Riedmiller,et al. Autonomous reinforcement learning on raw visual input data in a real world application , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[168] Jürgen Schmidhuber,et al. Optimal Artificial Curiosity, Creativity, Music, and the Fine Arts , 2005 .
[169] Tobi Delbrück,et al. Orientation-Selective aVLSI Spiking Neurons , 2001, NIPS.
[170] Jürgen Schmidhuber,et al. Classifying Unprompted Speech by Retraining LSTM Nets , 2005, ICANN.
[171] David G. Lowe,et al. Object recognition from local scale-invariant features , 1999, Proceedings of the Seventh IEEE International Conference on Computer Vision.
[172] Shixin Cheng,et al. Dynamic learning rate optimization of the backpropagation algorithm , 1995, IEEE Trans. Neural Networks.
[173] Thomas M. Breuel,et al. High-Performance OCR for Printed English and Fraktur Using LSTM Networks , 2013, 2013 12th International Conference on Document Analysis and Recognition.
[174] Shun-ichi Amari,et al. A Theory of Adaptive Pattern Classifiers , 1967, IEEE Trans. Electron. Comput..
[175] Paul J. Werbos,et al. Neural networks for control and system identification , 1989, Proceedings of the 28th IEEE Conference on Decision and Control,.
[176] Thomas G. Dietterich,et al. Editors. Advances in Neural Information Processing Systems , 2002 .
[177] Thomas Serre,et al. On the Role of Object-Specific Features for Real World Object Recognition in Biological Vision , 2002, Biologically Motivated Computer Vision.
[178] Padhraic Smyth,et al. Discrete recurrent neural networks for grammatical inference , 1994, IEEE Trans. Neural Networks.
[179] Jürgen Schmidhuber,et al. Prototype Resilient, Self-Modeling Robots , 2007, Science.
[180] Lillian Lee,et al. Learning of Context-Free Languages: A Survey of the Literature , 1996 .
[181] Johannes Stallkamp,et al. The German Traffic Sign Recognition Benchmark: A multi-class classification competition , 2011, The 2011 International Joint Conference on Neural Networks.
[182] Honglak Lee,et al. Unsupervised feature learning for audio classification using convolutional deep belief networks , 2009, NIPS.
[183] Fu-Chuang Chen,et al. Adaptive control of nonlinear systems using neural networks , 1992 .
[184] Yoram Singer,et al. The Hierarchical Hidden Markov Model: Analysis and Applications , 1998, Machine Learning.
[185] B. Kosko. Unsupervised learning in noise , 1989 .
[186] Narendra Ahuja,et al. Cresceptron: a self-organizing neural network which grows adaptively , 1992, [Proceedings 1992] IJCNN International Joint Conference on Neural Networks.
[187] Geoffrey E. Hinton,et al. Learning to Represent Spatial Transformations with Factored Higher-Order Boltzmann Machines , 2010, Neural Computation.
[188] Anders Krogh,et al. A Simple Weight Decay Can Improve Generalization , 1991, NIPS.
[189] David E. Moriarty,et al. Symbiotic Evolution of Neural Networks in Sequential Decision Tasks , 1997 .
[190] Bernd Fritzke,et al. A Growing Neural Gas Network Learns Topologies , 1994, NIPS.
[191] Jürgen Schmidhuber,et al. Netzwerkarchitekturen, Zielfunktionen und Kettenregel , 1993 .
[192] Xin Yao,et al. A review of evolutionary artificial neural networks , 1993, Int. J. Intell. Syst..
[193] Yoshua Bengio,et al. An Empirical Investigation of Catastrophic Forgetting in Gradient-Based Neural Networks , 2013, ICLR.
[194] D. Goldfarb. A family of variable-metric methods derived by variational means , 1970 .
[195] Yoshua Bengio,et al. Maxout Networks , 2013, ICML.
[196] Lawrence D. Jackel,et al. Handwritten Digit Recognition with a Back-Propagation Network , 1989, NIPS.
[197] Paul J. Werbos,et al. Generalization of backpropagation with application to a recurrent gas market model , 1988, Neural Networks.
[198] Michael C. Mozer,et al. Skeletonization: A Technique for Trimming the Fat from a Network via Relevance Assessment , 1988, NIPS.
[199] Yoshua Bengio,et al. Spike-and-Slab Sparse Coding for Unsupervised Feature Discovery , 2012, ArXiv.
[200] Ingo Rechenberg,et al. Evolutionsstrategie : Optimierung technischer Systeme nach Prinzipien der biologischen Evolution , 1973 .
[201] Doina Precup,et al. Multi-time Models for Temporally Abstract Planning , 1997, NIPS.
[202] Jürgen Schmidhuber,et al. Training Recurrent Networks by Evolino , 2007, Neural Computation.
[203] Yoshua Bengio,et al. Why Does Unsupervised Pre-training Help Deep Learning? , 2010, AISTATS.
[204] Peter Lennie,et al. Coding of color and form in the geniculostriate visual pathway (invited review). , 2005, Journal of the Optical Society of America. A, Optics, image science, and vision.
[206] Michael L. Littman,et al. Algorithms for Sequential Decision Making , 1996 .
[207] Kumpati S. Narendra,et al. Identification and control of dynamical systems using neural networks , 1990, IEEE Trans. Neural Networks.
[208] Yishay Mansour,et al. Policy Gradient Methods for Reinforcement Learning with Function Approximation , 1999, NIPS.
[209] D. B. Fogel,et al. Evolving neural networks , 1990, Biological Cybernetics.
[210] Narendra Ahuja,et al. Learning Recognition and Segmentation Using the Cresceptron , 1997, International Journal of Computer Vision.
[211] Leo Breiman,et al. Bagging Predictors , 1996, Machine Learning.
[212] Kenneth Levenberg. A Method for the Solution of Certain Non-Linear Problems in Least Squares , 1944 .
[213] Vladimir Vapnik,et al. Principles of Risk Minimization for Learning Theory , 1991, NIPS.
[214] Eric Moulines,et al. A blind source separation technique using second-order statistics , 1997, IEEE Trans. Signal Process..
[215] Rafal Salustowicz,et al. Probabilistic Incremental Program Evolution , 1997, Evolutionary Computation.
[216] M. Kramer. Nonlinear principal component analysis using autoassociative neural networks , 1991 .
[217] Peter L. Bartlett,et al. Infinite-Horizon Policy-Gradient Estimation , 2001, J. Artif. Intell. Res..
[218] J. Stephen Judd,et al. Neural network design and the complexity of learning , 1990, Neural network modeling and connectionism.
[219] Michael J. Frank,et al. Making Working Memory Work: A Computational Model of Learning in the Prefrontal Cortex and Basal Ganglia , 2006, Neural Computation.
[220] Tom Schaul,et al. Multi-Dimensional Deep Memory Atari-Go Players for Parameter Exploring Policy Gradients , 2010, ICANN.
[221] A. G. Ivakhnenko,et al. Polynomial Theory of Complex Systems , 1971, IEEE Trans. Syst. Man Cybern..
[222] Reinhold Behringer,et al. The seeing passenger car 'VaMoRs-P' , 1994, Proceedings of the Intelligent Vehicles '94 Symposium.
[223] R. Tibshirani,et al. Generalized additive models for medical research , 1986, Statistical methods in medical research.
[224] Sepp Hochreiter,et al. Learning to Learn Using Gradient Descent , 2001, ICANN.
[225] G. Miller. Learning to Forget , 2004, Science.
[226] Jeffrey L. Elman,et al. Learning and Evolution in Neural Networks , 1994, Adapt. Behav..
[227] Wulfram Gerstner,et al. Reduction of the Hodgkin-Huxley Equations to a Single-Variable Threshold Model , 1997, Neural Computation.
[228] Yves Deville,et al. Logic Program Synthesis , 1994, J. Log. Program..
[229] Julian F. Miller,et al. Genetic and Evolutionary Computation — GECCO 2003 , 2003, Lecture Notes in Computer Science.
[230] Danil V. Prokhorov,et al. Enhanced Multi-Stream Kalman Filter Training for Recurrent Networks , 1998 .
[231] C. Lee Giles,et al. Effects of Noise on Convergence and Generalization in Recurrent Networks , 1994, NIPS.
[232] G. Palm,et al. On associative memory , 2004, Biological Cybernetics.
[233] Laurenz Wiskott,et al. Slowness and Sparseness Lead to Place, Head-Direction, and Spatial-View Cells , 2007, PLoS Comput. Biol..
[234] Anton Schwartz,et al. A Reinforcement Learning Method for Maximizing Undiscounted Rewards , 1993, ICML.
[235] Sven Behnke,et al. Evaluation of Pooling Operations in Convolutional Architectures for Object Recognition , 2010, ICANN.
[236] Douglas Aberdeen,et al. Policy-Gradient Algorithms for Partially Observable Markov Decision Processes , 2003 .
[237] P. Werbos. Backwards Differentiation in AD and Neural Nets: Past Links and New Opportunities , 2006 .
[238] Satinder P. Singh,et al. Reinforcement Learning Algorithms for Average-Payoff Markovian Decision Processes , 1994, AAAI.
[239] Terrence J. Sejnowski,et al. Tempering Backpropagation Networks: Not All Weights are Created Equal , 1995, NIPS.
[240] C. E. Shannon,et al. A mathematical theory of communication , 1948, MOCO.
[241] Robert A. Jacobs,et al. Increased rates of convergence through learning rate adaptation , 1987, Neural Networks.
[242] Tom Schaul,et al. Exponential natural evolution strategies , 2010, GECCO '10.
[243] R. Kurzweil. How to Create a Mind: The Secret of Human Thought Revealed , 2012 .
[244] Long-Ji Lin,et al. Reinforcement learning for robots using neural networks , 1992 .
[245] Ronald J. Williams,et al. Training recurrent networks using the extended Kalman filter , 1992, [Proceedings 1992] IJCNN International Joint Conference on Neural Networks.
[246] Jude W. Shavlik,et al. Combining Symbolic and Neural Learning , 1994, Machine Learning.
[247] Geoffrey E. Hinton,et al. The Recurrent Temporal Restricted Boltzmann Machine , 2008, NIPS.
[248] Harald Haas,et al. Harnessing Nonlinearity: Predicting Chaotic Systems and Saving Energy in Wireless Communication , 2004, Science.
[249] Martin A. Riedmiller,et al. A direct adaptive method for faster backpropagation learning: the RPROP algorithm , 1993, IEEE International Conference on Neural Networks.
[250] Jürgen Schmidhuber,et al. Kalman filters improve LSTM network performance in problems unsolvable by traditional recurrent nets , 2003, Neural Networks.
[251] Nikolaus Hansen,et al. Completely Derandomized Self-Adaptation in Evolution Strategies , 2001, Evolutionary Computation.
[252] T. Kohonen,et al. Self-organizing semantic maps , 1989, Biological Cybernetics.
[253] Dong Yu,et al. Deep Learning: Methods and Applications , 2014, Found. Trends Signal Process..
[254] Yann LeCun,et al. Improving the convergence of back-propagation learning with second-order methods , 1989 .
[255] Halbert White,et al. Learning in Artificial Neural Networks: A Statistical Perspective , 1989, Neural Computation.
[256] Les E. Atlas,et al. Recurrent neural networks and robust time series prediction , 1994, IEEE Trans. Neural Networks.
[257] Garrison W. Cottrell,et al. Non-Linear Dimensionality Reduction , 1992, NIPS.
[258] David E. Rumelhart,et al. Generalization by Weight-Elimination with Application to Forecasting , 1990, NIPS.
[259] Jürgen Schmidhuber,et al. Learning to Generate Artificial Fovea Trajectories for Target Detection , 1991, Int. J. Neural Syst..
[260] M. F. Møller,et al. Exact Calculation of the Product of the Hessian Matrix of Feed-Forward Network Error Functions and a Vector in O(N) Time , 1993 .
[261] Geoffrey E. Hinton,et al. Generative models for discovering sparse distributed representations. , 1997, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.
[262] Andrew W. Moore,et al. Gradient Descent for General Reinforcement Learning , 1998, NIPS.
[263] S. Haykin. Kalman Filtering and Neural Networks , 2001 .
[264] Risto Miikkulainen,et al. Evolving Neural Networks through Augmenting Topologies , 2002, Evolutionary Computation.
[265] Rob Fergus,et al. Visualizing and Understanding Convolutional Networks , 2013, ECCV.
[266] Paul Rodríguez,et al. A Recurrent Neural Network that Learns to Count , 1999, Connect. Sci..
[267] Jürgen Schmidhuber,et al. An on-line algorithm for dynamic reinforcement learning and planning in reactive environments , 1990, 1990 IJCNN International Joint Conference on Neural Networks.
[268] L. Abbott,et al. Competitive Hebbian learning through spike-timing-dependent synaptic plasticity , 2000, Nature Neuroscience.
[269] Michael I. Jordan,et al. Reinforcement Learning Algorithm for Partially Observable Markov Decision Problems , 1994, NIPS.
[270] S. Dreyfus. The computational solution of optimal control problems with time lag , 1973 .
[271] Geoffrey E. Hinton,et al. The "wake-sleep" algorithm for unsupervised neural networks. , 1995, Science.
[272] Erkki Oja,et al. Independent component analysis: algorithms and applications , 2000, Neural Networks.
[273] Honglak Lee,et al. Sparse deep belief net model for visual area V2 , 2007, NIPS.
[274] Giovanni Soda,et al. Bidirectional Dynamics for Protein Secondary Structure Prediction , 2001, Sequence Learning.
[275] Józef Korbicz,et al. A GMDH neural network-based approach to robust fault diagnosis: Application to the DAMADICS benchmark problem , 2006 .
[276] Jürgen Schmidhuber,et al. Reinforcement Learning with Self-Modifying Policies , 1998, Learning to Learn.
[277] Douglas B. Lenat,et al. Theory Formation by Heuristic Search , 1983, Artificial Intelligence.
[278] M. Mozer. Discovering Discrete Distributed Representations with Iterative Competitive Learning , 1990, NIPS 1990.
[279] R. Bellman. Dynamic programming. , 1957, Science.
[280] Risto Miikkulainen,et al. Efficient Reinforcement Learning through Symbiotic Evolution , 2004 .
[281] Geoffrey E. Hinton,et al. Reducing the Dimensionality of Data with Neural Networks , 2006, Science.
[282] Grégoire Montavon,et al. Neural Networks: Tricks of the Trade , 2012, Lecture Notes in Computer Science.
[283] D. Perrett,et al. Visual neurones responsive to faces in the monkey temporal cortex , 2004, Experimental Brain Research.
[284] Geoffrey E. Hinton. Training Products of Experts by Minimizing Contrastive Divergence , 2002, Neural Computation.
[285] Kenneth O. Stanley,et al. On the Performance of Indirect Encoding Across the Continuum of Regularity , 2011, IEEE Transactions on Evolutionary Computation.
[286] Sridhar Mahadevan,et al. Recent Advances in Hierarchical Reinforcement Learning , 2003, Discret. Event Dyn. Syst..
[287] Satosi Watanabe,et al. Pattern Recognition: Human and Mechanical , 1985 .
[288] Jürgen Schmidhuber,et al. Sequential Constant Size Compressors for Reinforcement Learning , 2011, AGI.
[289] Stefano Nolfi,et al. How to Evolve Autonomous Robots: Different Approaches in Evolutionary Robotics , 1994 .
[290] C. S. Wallace,et al. An Information Measure for Classification , 1968, Comput. J..
[291] Jürgen Schmidhuber,et al. A System for Robotic Heart Surgery that Learns to Tie Knots Using Recurrent Neural Networks , 2006 .
[292] Nasser M. Nasrabadi,et al. Pattern Recognition and Machine Learning , 2006, Technometrics.
[293] D. Hubel,et al. Receptive fields of single neurones in the cat's striate cortex , 1959, The Journal of physiology.
[294] Paul E. Utgoff,et al. Many-Layered Learning , 2002, Neural Computation.
[295] A. Kolmogorov. Three approaches to the quantitative definition of information , 1968 .
[296] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[297] Jirí Síma,et al. Training a Single Sigmoidal Neuron Is Hard , 2002, Neural Comput..
[298] David Barber,et al. On the Computational Complexity of Stochastic Controller Optimization in POMDPs , 2011, TOCT.
[299] Dana H. Ballard,et al. Modular Learning in Neural Networks , 1987, AAAI.
[300] Steve B. Furber,et al. Modeling Spiking Neural Networks on SpiNNaker , 2010, Computing in Science & Engineering.
[301] R. Kempter,et al. Hebbian learning and spiking neurons , 1999 .
[302] M. C. Jones,et al. Spline Smoothing and Nonparametric Regression. , 1989 .
[303] Tom Schaul,et al. A linear time natural evolution strategy for non-separable functions , 2011, GECCO.
[304] Kuldip K. Paliwal,et al. Bidirectional recurrent neural networks , 1997, IEEE Trans. Signal Process..
[305] R. Vaillant,et al. Original approach for the localisation of objects in images , 1994 .
[306] Barnabás Póczos,et al. Cross-Entropy Optimization for Independent Process Analysis , 2006, ICA.
[307] S. Dreyfus. The numerical solution of variational problems , 1962 .
[308] Ronald J. Williams,et al. Simple Statistical Gradient-Following Algorithms for Connectionist Reinforcement Learning , 2004, Machine Learning.
[309] Yoshinori Sagisaka,et al. Phoneme boundary estimation using bidirectional recurrent neural networks and its applications , 1999 .
[310] Wolfgang Maass,et al. Emergence of complex computational structures from chaotic neural networks through reward-modulated Hebbian learning. , 2014, Cerebral cortex.
[311] Tara N. Sainath,et al. Improving deep neural networks for LVCSR using rectified linear units and dropout , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[312] J. H. Wilkinson. The algebraic eigenvalue problem , 1966 .
[313] Radford M. Neal,et al. High Dimensional Classification with Bayesian Neural Networks and Dirichlet Diffusion Trees , 2006, Feature Extraction.
[314] H. Akaike. Statistical predictor identification , 1970 .
[315] Stanley J. Farlow,et al. Self-Organizing Methods in Modeling: Gmdh Type Algorithms , 1984 .
[316] Andreas Rauber,et al. The growing hierarchical self-organizing map: exploratory analysis of high-dimensional data , 2002, IEEE Trans. Neural Networks.
[317] John N. Tsitsiklis,et al. Feature-based methods for large scale dynamic programming , 2004, Machine Learning.
[318] A. Turing. On Computable Numbers, with an Application to the Entscheidungsproblem. , 1937 .
[319] Jürgen Schmidhuber,et al. Learning Algorithms for Networks with Internal and External Feedback , 1990 .
[320] Terrence J. Sejnowski,et al. Graphical Models: Foundations of Neural Computation , 2001, Pattern Anal. Appl..
[321] Geoffrey E. Hinton,et al. Simplifying Neural Networks by Soft Weight-Sharing , 1992, Neural Computation.
[322] Gert Cauwenberghs,et al. Neuromorphic Silicon Neuron Circuits , 2011, Front. Neurosci.
[323] Geoffrey E. Hinton,et al. Feudal Reinforcement Learning , 1992, NIPS.
[324] Jing Peng,et al. Incremental multi-step Q-learning , 1994, Machine Learning.
[325] Jürgen Schmidhuber,et al. Evolving large-scale neural networks for vision-based reinforcement learning , 2013, GECCO '13.
[326] Derek C. Rose,et al. Deep Machine Learning - A New Frontier in Artificial Intelligence Research [Research Frontier] , 2010, IEEE Computational Intelligence Magazine.
[327] Jürgen Schmidhuber,et al. An intrinsic value system for developing multiple invariant representations with incremental slowness learning , 2013, Front. Neurorobot..
[328] Shigenobu Kobayashi,et al. Reinforcement Learning in POMDPs with Function Approximation , 1997, ICML.
[329] Jordan B. Pollack,et al. Recursive Distributed Representations , 1990, Artif. Intell..
[330] W. Senn,et al. Matching Recall and Storage in Sequence Learning with Spiking Neural Networks , 2013, The Journal of Neuroscience.
[331] Marco Zorzi,et al. Emergence of a 'visual number sense' in hierarchical generative models , 2012, Nature Neuroscience.
[332] Andrew W. Moore,et al. Prioritized Sweeping: Reinforcement Learning with Less Data and Less Time , 1993, Machine Learning.
[333] Pierre Baldi,et al. Deep Architectures and Deep Learning in Chemoinformatics: The Prediction of Aqueous Solubility for Drug-Like Molecules , 2013, J. Chem. Inf. Model..
[334] Geoffrey E. Hinton,et al. Varieties of Helmholtz Machine , 1996, Neural Networks.
[335] Tadashi Kondo,et al. Multi-layered GMDH-type neural network self-selecting optimum neural network architecture and its application to 3-dimensional medical image recognition of blood vessels , 2008 .
[336] Pascal Vincent,et al. Representation Learning: A Review and New Perspectives , 2012, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[337] Davide Anguita,et al. An efficient implementation of BP on RISC-based workstations , 1994, Neurocomputing.
[338] A. S. Weigend,et al. Results of the time series prediction competition at the Santa Fe Institute , 1993, IEEE International Conference on Neural Networks.
[339] Jürgen Schmidhuber,et al. PowerPlay: Training an Increasingly General Problem Solver by Continually Searching for the Simplest Still Unsolvable Problem , 2011, Front. Psychol..
[340] Shun-ichi Amari,et al. Natural Gradient Works Efficiently in Learning , 1998, Neural Computation.
[341] Bart Kosko,et al. Unsupervised learning in noise , 1990, International 1989 Joint Conference on Neural Networks.
[342] Jürgen Schmidhuber,et al. A committee of neural networks for traffic sign classification , 2011, The 2011 International Joint Conference on Neural Networks.
[343] John H. Holland,et al. Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence , 1992 .
[344] C. Lee Giles,et al. Extraction of rules from discrete-time recurrent neural networks , 1996, Neural Networks.
[345] Robert Balzer,et al. A 15 Year Perspective on Automatic Programming , 1985, IEEE Transactions on Software Engineering.
[346] Saburo Ikeda,et al. Sequential GMDH Algorithm and Its Application to River Flow Prediction , 1976, IEEE Transactions on Systems, Man, and Cybernetics.
[347] D I Perrett,et al. Organization and functions of cells responsive to faces in the temporal cortex. , 1992, Philosophical transactions of the Royal Society of London. Series B, Biological sciences.
[348] H. Sebastian Seung,et al. Natural Image Denoising with Convolutional Networks , 2008, NIPS.
[349] Jürgen Schmidhuber,et al. Accelerated learning in back-propagation nets , 1989 .
[350] Jürgen Schmidhuber,et al. Incremental Slow Feature Analysis: Adaptive Low-Complexity Slow Feature Updating from High-Dimensional Input Streams , 2012, Neural Computation.
[351] Naonori Ueda,et al. Optimal Linear Combination of Neural Networks for Improving Classification Performance , 2000, IEEE Trans. Pattern Anal. Mach. Intell..
[352] Peter Craven,et al. Smoothing noisy data with spline functions , 1978 .
[353] Nicolas Brunel,et al. Dynamics of a recurrent network of spiking neurons before and following learning , 1997 .
[354] Johannes Schemmel,et al. Implementing Synaptic Plasticity in a VLSI Spiking Neural Network Model , 2006, The 2006 IEEE International Joint Conference on Neural Network Proceedings.
[355] Jürgen Schmidhuber,et al. A robot that reinforcement-learns to identify and memorize important previous observations , 2003, Proceedings 2003 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS 2003) (Cat. No.03CH37453).
[356] Eduardo Sontag,et al. Turing computability with neural nets , 1991 .
[357] Guozhong An,et al. The Effects of Adding Noise During Backpropagation Training on a Generalization Performance , 1996, Neural Computation.
[358] Geoffrey E. Hinton,et al. Learning internal representations by error propagation , 1986 .
[359] Fred Henrik Hamker,et al. Learning Invariance from Natural Images Inspired by Observations in the Primary Visual Cortex , 2012, Neural Computation.
[360] R. FitzHugh. Impulses and Physiological States in Theoretical Models of Nerve Membrane. , 1961, Biophysical journal.
[361] Stefano Nolfi,et al. Evolving mobile robots in simulated and real environments , 1995 .
[362] Kenneth O. Stanley,et al. A Hypercube-Based Encoding for Evolving Large-Scale Neural Networks , 2009, Artificial Life.
[363] Kazumi Saito,et al. Partial BFGS Update and Efficient Step-Length Calculation for Three-Layer Neural Networks , 1997, Neural Computation.
[364] Robert Babuska,et al. A Survey of Actor-Critic Reinforcement Learning: Standard and Natural Policy Gradients , 2012, IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews).
[365] Luca Maria Gambardella,et al. Fast image scanning with deep max-pooling convolutional neural networks , 2013, 2013 IEEE International Conference on Image Processing.
[366] Robert A. Legenstein,et al. Reinforcement Learning on Slow Features of High-Dimensional Input Streams , 2010, PLoS Comput. Biol..
[367] Risto Miikkulainen,et al. Active Guidance for a Finless Rocket Using Neuroevolution , 2003, GECCO.
[368] Andrzej Cichocki,et al. Neural networks for optimization and signal processing , 1993 .
[369] Kumpati S. Narendra,et al. Learning Automata - A Survey , 1974, IEEE Trans. Syst. Man Cybern..
[370] Wolfgang Maass,et al. Emergence of Dynamic Memory Traces in Cortical Microcircuit Models through STDP , 2013, The Journal of Neuroscience.
[371] Geoffrey E. Hinton,et al. Learning and relearning in Boltzmann machines , 1986 .
[372] James L. McClelland,et al. Parallel distributed processing: explorations in the microstructure of cognition, vol. 1: foundations , 1986 .
[373] M. Stemmler. A single spike suffices: the simplest form of stochastic resonance in model neurons , 1996 .
[374] Matthew D. Zeiler. ADADELTA: An Adaptive Learning Rate Method , 2012, ArXiv.
[375] David H. Wolpert,et al. Bayesian Backpropagation Over I-O Functions Rather Than Weights , 1993, NIPS.
[376] Luca Maria Gambardella,et al. Flexible, High Performance Convolutional Neural Networks for Image Classification , 2011, IJCAI.
[377] T. Poggio,et al. Hierarchical models of object recognition in cortex , 1999, Nature Neuroscience.
[378] Jan Peters. Policy gradient methods , 2010, Scholarpedia.
[379] F. Vallet,et al. Robustness in Multilayer Perceptrons , 1993, Neural Computation.
[380] Marc'Aurelio Ranzato,et al. Unsupervised Learning of Invariant Feature Hierarchies with Applications to Object Recognition , 2007, 2007 IEEE Conference on Computer Vision and Pattern Recognition.
[381] Peter Stone,et al. Policy gradient reinforcement learning for fast quadrupedal locomotion , 2004, IEEE International Conference on Robotics and Automation, 2004. Proceedings. ICRA '04. 2004.
[382] Nuttapong Chentanez,et al. Intrinsically Motivated Reinforcement Learning , 2004, NIPS.
[383] Geoffrey E. Hinton,et al. Speech recognition with deep recurrent neural networks , 2013, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[384] Sridhar Mahadevan,et al. Hierarchical Policy Gradient Algorithms , 2003, ICML.
[385] Erkki Oja,et al. Neural Networks, Principal Components, and Subspaces , 1989, Int. J. Neural Syst..
[386] Alex Graves,et al. Practical Variational Inference for Neural Networks , 2011, NIPS.
[387] Volkmar Frinken,et al. Long-short term memory neural networks language modeling for handwriting recognition , 2012, Proceedings of the 21st International Conference on Pattern Recognition (ICPR2012).
[388] Alan S. Lapedes,et al. A self-optimizing, nonsymmetrical neural net for content addressable memory and pattern recognition , 1986 .
[389] José Carlos Príncipe,et al. A Theory for Neural Networks with Time Delays , 1990, NIPS.
[390] Eric Saund,et al. Unsupervised Learning of Mixtures of Multiple Causes in Binary Data , 1993, NIPS.
[391] Jürgen Schmidhuber,et al. A local learning algorithm for dynamic feedforward and recurrent networks , 1990, Forschungsberichte, TU Munich.
[392] Yann LeCun,et al. A theoretical framework for back-propagation , 1988 .
[393] Henry Markram,et al. Real-Time Computing Without Stable States: A New Framework for Neural Computation Based on Perturbations , 2002, Neural Computation.
[394] Stephen Grossberg,et al. Adaptive pattern classification and universal recoding: II. Feedback, expectation, olfaction, illusions , 1976, Biological Cybernetics.
[395] Wolfgang Maass,et al. Bayesian Computation Emerges in Generic Cortical Microcircuits through Spike-Timing-Dependent Plasticity , 2013, PLoS Comput. Biol..
[396] Volkmar Frinken,et al. Mode Detection in Online Handwritten Documents Using BLSTM Neural Networks , 2012, 2012 International Conference on Frontiers in Handwriting Recognition.
[397] Gregory J. Chaitin,et al. On the Length of Programs for Computing Finite Binary Sequences , 1966, JACM.
[398] Tobi Delbrück,et al. CAVIAR: A 45k Neuron, 5M Synapse, 12G Connects/s AER Hardware Sensory–Processing–Learning–Actuating System for High-Speed Visual Object Recognition and Tracking , 2009, IEEE Transactions on Neural Networks.
[399] Sridhar Mahadevan,et al. Average reward reinforcement learning: Foundations, algorithms, and empirical results , 2004, Machine Learning.
[400] Kee-Eung Kim,et al. Learning Finite-State Controllers for Partially Observable Environments , 1999, UAI.
[401] Alekseĭ Grigorʹevich Ivakhnenko,et al. CYBERNETIC PREDICTING DEVICES , 1966 .
[402] Tom Schaul,et al. Natural Evolution Strategies , 2008, 2008 IEEE Congress on Evolutionary Computation (IEEE World Congress on Computational Intelligence).
[403] Jürgen Schmidhuber,et al. My First Deep Learning System of 1991 + Deep Learning Timeline 1962-2013 , 2013, ArXiv.
[404] Richard S. Sutton,et al. Reinforcement Learning: An Introduction , 1998, IEEE Trans. Neural Networks.
[405] Pierre Baldi,et al. Hybrid Modeling, HMM/NN Architectures, and Protein Applications , 1996, Neural Computation.
[406] Jürgen Schmidhuber,et al. Unconstrained On-line Handwriting Recognition with Recurrent Neural Networks , 2007, NIPS.
[407] Karl Sims,et al. Evolving virtual creatures , 1994, SIGGRAPH.
[408] M. Stone. Cross‐Validatory Choice and Assessment of Statistical Predictions , 1976 .
[409] Thomas G. Dietterich. Multiple Classifier Systems , 2000, Lecture Notes in Computer Science.
[410] L. Baum,et al. Statistical Inference for Probabilistic Functions of Finite State Markov Chains , 1966 .
[411] Trevor Darrell,et al. DeCAF: A Deep Convolutional Activation Feature for Generic Visual Recognition , 2013, ICML.
[412] Volkmar Frinken,et al. Keyword Spotting in Online Handwritten Documents Containing Text and Non-text Using BLSTM Neural Networks , 2011, 2011 International Conference on Document Analysis and Recognition.
[413] Alexander H. Waibel,et al. The Tempo 2 Algorithm: Adjusting Time-Delays By Supervised Learning , 1990, NIPS.
[414] Joachim Diederich,et al. Survey and critique of techniques for extracting rules from trained artificial neural networks , 1995, Knowl. Based Syst..
[415] C. G. Broyden. A Class of Methods for Solving Nonlinear Simultaneous Equations , 1965 .
[416] Christian Osendorfer,et al. On Fast Dropout and its Applicability to Recurrent Networks , 2013, ICLR.
[417] Julian Togelius,et al. The 2009 Simulated Car Racing Championship , 2010, IEEE Transactions on Computational Intelligence and AI in Games.
[418] Barak A. Pearlmutter. Learning State Space Trajectories in Recurrent Neural Networks , 1989, Neural Computation.
[419] Bruce W. Schmeiser,et al. Improving model accuracy using optimal linear combinations of trained neural networks , 1995, IEEE Trans. Neural Networks.
[420] Jude W. Shavlik,et al. Combining the Predictions of Multiple Classifiers: Using Competitive Learning to Initialize Neural Networks , 1995, IJCAI.
[421] Lorien Y. Pratt,et al. Comparing Biases for Minimal Network Construction with Back-Propagation , 1988, NIPS.
[422] Jürgen Schmidhuber,et al. Transfer learning for Latin and Chinese characters with Deep Neural Networks , 2012, The 2012 International Joint Conference on Neural Networks (IJCNN).
[423] Jürgen Schmidhuber,et al. Gödel Machines: Fully Self-referential Optimal Universal Self-improvers , 2007, Artificial General Intelligence.
[424] Arthur L. Samuel,et al. Some studies in machine learning using the game of checkers , 2000, IBM J. Res. Dev..
[425] Tom Schaul,et al. The two-dimensional organization of behavior , 2011, 2011 IEEE International Conference on Development and Learning (ICDL).
[426] Aude Billard,et al. From Animals to Animats , 2004 .
[427] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[428] Yoshua Bengio,et al. Hierarchical Recurrent Neural Networks for Long-Term Dependencies , 1995, NIPS.
[429] Helge J. Ritter,et al. Three-dimensional neural net for learning visuomotor coordination of a robot arm , 1990, IEEE Trans. Neural Networks.
[430] Mark A. Pitt,et al. Advances in Minimum Description Length: Theory and Applications , 2005 .
[431] Paul J. Werbos,et al. Applications of advances in nonlinear sensitivity analysis , 1982 .
[432] J. Schmidhuber. An 'introspective' network that can learn to run its own weight change algorithm , 1993 .
[433] David Haussler,et al. Occam's Razor , 1987, Inf. Process. Lett..
[434] Stefan Schaal,et al. Natural Actor-Critic , 2003, Neurocomputing.
[435] Christopher D. Manning,et al. Fast dropout training , 2013, ICML.
[436] D. Marquardt. An Algorithm for Least-Squares Estimation of Nonlinear Parameters , 1963 .
[437] David J. Field,et al. Emergence of simple-cell receptive field properties by learning a sparse code for natural images , 1996, Nature.
[438] Astro Teller,et al. The evolution of mental models , 1994 .
[439] A. Church. An Unsolvable Problem of Elementary Number Theory , 1936 .
[440] Yaroslav Bulatov,et al. Multi-digit Number Recognition from Street View Imagery using Deep Convolutional Neural Networks , 2013, ICLR.
[441] David E. Goldberg,et al. Genetic Algorithms in Search Optimization and Machine Learning , 1988 .
[442] Shih-Chii Liu,et al. Minitaur, an Event-Driven FPGA-Based Spiking Network Accelerator , 2014, IEEE Transactions on Very Large Scale Integration (VLSI) Systems.
[443] Ming Yang,et al. 3D Convolutional Neural Networks for Human Action Recognition , 2010, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[444] A. Norman Redlich,et al. Redundancy Reduction as a Strategy for Unsupervised Learning , 1993, Neural Computation.
[445] Oren Etzioni,et al. Explanation-Based Learning: A Problem Solving Perspective , 1989, Artif. Intell..
[446] Eugene M. Izhikevich,et al. Simple model of spiking neurons , 2003, IEEE Trans. Neural Networks.
[447] Geoffrey E. Hinton,et al. A time-delay neural network architecture for isolated word recognition , 1990, Neural Networks.
[448] Yee Whye Teh,et al. A Fast Learning Algorithm for Deep Belief Nets , 2006, Neural Computation.
[449] F. Pasemann,et al. Evolving structure and function of neurocontrollers , 1999, Proceedings of the 1999 Congress on Evolutionary Computation-CEC99 (Cat. No. 99TH8406).
[450] Wolfgang Maass,et al. Networks of spiking neurons: the third generation of neural network models , 1997 .
[451] Frank Bärmann,et al. A learning algorithm for multilayered neural networks based on linear least squares problems , 1993, Neural Networks.
[452] Michael J. Carter,et al. Operational Fault Tolerance of CMAC Networks , 1989, NIPS.
[453] Pierre Comon,et al. Independent component analysis, A new concept? , 1994, Signal Process..
[454] Jordan B. Pollack,et al. RAAM for infinite context-free languages , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[455] A. Hodgkin,et al. A quantitative description of membrane current and its application to conduction and excitation in nerve , 1952, The Journal of physiology.
[456] Jürgen Schmidhuber,et al. Learning Factorial Codes by Predictability Minimization , 1992, Neural Computation.
[457] Jürgen Schmidhuber,et al. Compete to Compute , 2013, NIPS.
[458] Pierre Baldi,et al. Neural Networks for Fingerprint Recognition , 1993, Neural Computation.
[459] Yoshua Bengio,et al. Extracting and composing robust features with denoising autoencoders , 2008, ICML '08.
[460] Randall C. O'Reilly,et al. Biologically Plausible Error-Driven Learning Using Local Activation Differences: The Generalized Recirculation Algorithm , 1996, Neural Computation.
[461] Jason Weston,et al. A unified architecture for natural language processing: deep neural networks with multitask learning , 2008, ICML '08.
[462] Jürgen Schmidhuber,et al. Flat Minima , 1997, Neural Computation.
[463] Jürgen Schmidhuber,et al. Evolving neural networks in compressed weight space , 2010, GECCO '10.
[464] H. G. Schuster. Learning by maximizing the information transfer through nonlinear noisy neurons and "noise breakdown" , 1992 .
[465] Geoffrey E. Hinton,et al. Keeping Neural Networks Simple , 1993 .
[466] C. Malsburg. Self-organization of orientation sensitive cells in the striate cortex , 2004, Kybernetik.
[467] Ansgar Heinrich Ludolf West,et al. Adaptive Back-Propagation in On-Line Learning of Multilayer Networks , 1995, NIPS.
[468] S. Grossberg,et al. Adaptive pattern classification and universal recoding: I. Parallel development and coding of neural feature detectors , 1976, Biological Cybernetics.
[469] Yoshua Bengio,et al. Learning Deep Architectures for AI , 2007, Found. Trends Mach. Learn..
[470] Anitha Pasupathy,et al. Transformation of shape information in the ventral pathway , 2007, Current Opinion in Neurobiology.
[471] Danil V. Prokhorov,et al. Adaptive behavior with fixed weights in RNN: an overview , 2002, Proceedings of the 2002 International Joint Conference on Neural Networks. IJCNN'02 (Cat. No.02CH37290).
[472] D. O. Hebb,et al. The organization of behavior , 1988 .
[473] S. Linnainmaa. Taylor expansion of the accumulated rounding error , 1976 .
[474] Terence D. Sanger,et al. An Optimality Principle for Unsupervised Learning , 1988, NIPS.
[475] Camille Couprie,et al. Learning Hierarchical Features for Scene Labeling , 2013, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[476] Qingxiang Wu,et al. A Novel Approach for the Implementation of Large Scale Spiking Neural Networks on FPGA Hardware , 2005, IWANN.
[477] L. Bobrowski. Learning processes in multilayer threshold nets , 1978, Biological Cybernetics.
[478] Jürgen Schmidhuber,et al. Sequence Labelling in Structured Domains with Hierarchical Recurrent Neural Networks , 2007, IJCAI.
[479] J. Nadal,et al. Nonlinear neurons in the low-noise limit: a factorial code maximizes information transfer , 1994, Network.
[480] Ronald J. Williams,et al. Experimental Analysis of the Real-time Recurrent Learning Algorithm , 1989 .
[481] Inman Harvey,et al. Evolving Recurrent Dynamical Networks for Robot Control , 1993 .
[482] M. Gherrity,et al. A learning algorithm for analog, fully recurrent neural networks , 1989, International 1989 Joint Conference on Neural Networks.
[483] Pineda,et al. Generalization of back-propagation to recurrent neural networks. , 1987, Physical review letters.
[484] C. Malsburg,et al. How patterned neural connections can be set up by self-organization , 1976, Proceedings of the Royal Society of London. Series B. Biological Sciences.
[485] Teuvo Kohonen,et al. Correlation Matrix Memories , 1972, IEEE Transactions on Computers.
[486] J. Rissanen. Stochastic Complexity and Modeling , 1986 .
[487] Tadashi Kondo,et al. GMDH neural network algorithm using the heuristic self-organization method and its application to the pattern identification problem , 1998, Proceedings of the 37th SICE Annual Conference. International Session Papers.
[488] Pierre-Yves Oudeyer,et al. Intrinsically Motivated Learning of Real-World Sensorimotor Skills with Developmental Constraints , 2013, Intrinsically Motivated Learning in Natural and Artificial Systems.
[489] Barak A. Pearlmutter. Fast Exact Multiplication by the Hessian , 1994, Neural Computation.
[490] D. Hubel,et al. Receptive fields, binocular interaction and functional architecture in the cat's visual cortex , 1962, The Journal of physiology.
[491] Goldberg,et al. Genetic algorithms , 1993, Robust Control Systems with Genetic Algorithms.
[492] Kunihiko Fukushima,et al. Increasing robustness against background noise: Visual pattern recognition by a neocognitron , 2011, Neural Networks.
[493] Barak A. Pearlmutter,et al. G-maximization: An unsupervised learning procedure for discovering regularities , 1987 .
[494] Mitsuo Kawato,et al. Inter-module credit assignment in modular reinforcement learning , 2003, Neural Networks.
[495] Justus H. Piater,et al. Closed-Loop Learning of Visual Control Policies , 2011, J. Artif. Intell. Res..
[496] Marc'Aurelio Ranzato,et al. Building high-level features using large scale unsupervised learning , 2011, 2013 IEEE International Conference on Acoustics, Speech and Signal Processing.
[497] K. D. Miller. A model for the development of simple cell receptive fields and the ordered arrangement of orientation columns through activity-dependent competition between ON- and OFF-center inputs , 1994, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[498] Tom Schaul,et al. No more pesky learning rates , 2012, ICML.
[499] Dong Yu,et al. Context-Dependent Pre-Trained Deep Neural Networks for Large-Vocabulary Speech Recognition , 2012, IEEE Transactions on Audio, Speech, and Language Processing.
[500] Nuttapong Chentanez,et al. Intrinsically Motivated Learning of Hierarchical Collections of Skills , 2004 .
[501] Tao Wang,et al. Deep learning with COTS HPC systems , 2013, ICML.
[502] Nils J. Nilsson,et al. Principles of Artificial Intelligence , 1980, IEEE Transactions on Pattern Analysis and Machine Intelligence.
[503] Jürgen Schmidhuber,et al. Unsupervised Learning in LSTM Recurrent Neural Networks , 2001, ICANN.
[504] Jing Peng,et al. An Efficient Gradient-Based Algorithm for On-Line Training of Recurrent Network Trajectories , 1990, Neural Computation.
[505] Richard C. T. Lee,et al. PROW: A Step Toward Automatic Program Writing , 1969, IJCAI.
[506] D. Gabor,et al. Theory of communication. Part 1: The analysis of information , 1946 .
[507] Nichael Lynn Cramer,et al. A Representation for the Adaptive Generation of Simple Sequential Programs , 1985, ICGA.
[508] Hervé Bourlard,et al. Connectionist Speech Recognition: A Hybrid Approach , 1993 .
[509] Jochen J. Steil,et al. Online reservoir adaptation by intrinsic plasticity for backpropagation-decorrelation and echo state learning , 2007, Neural Networks.
[510] Jianlin Cheng,et al. NNcon: improved protein contact map prediction using 2D-recursive neural networks , 2009, Nucleic Acids Res..
[511] Keiji Tanaka,et al. Neuronal selectivities to complex object features in the ventral visual pathway of the macaque cerebral cortex. , 1994, Journal of neurophysiology.
[512] Luca Maria Gambardella,et al. Deep Neural Networks Segment Neuronal Membranes in Electron Microscopy Images , 2012, NIPS.
[513] marquis de L'Hospital. Analyse des infiniment petits, pour l'intelligence des lignes courbes , 1970 .
[514] Hans-Georg Zimmermann,et al. Forecasting with Recurrent Neural Networks: 12 Tricks , 2012, Neural Networks: Tricks of the Trade.
[515] Jürgen Schmidhuber,et al. Offline Handwriting Recognition with Multidimensional Recurrent Neural Networks , 2008, NIPS.
[516] Julian Togelius,et al. Evolving Memory Cell Structures for Sequence Learning , 2009, ICANN.
[517] Shimon Whiteson,et al. Evolutionary Function Approximation for Reinforcement Learning , 2006, J. Mach. Learn. Res..
[518] K. Gödel. Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I , 1931 .
[519] Jürgen Schmidhuber,et al. Curious model-building control systems , 1991, [Proceedings] 1991 IEEE International Joint Conference on Neural Networks.
[520] M. A. Andrade,et al. Evaluation of secondary structure of proteins from UV circular dichroism spectra using an unsupervised learning neural network. , 1993, Protein engineering.
[521] Hiroaki Kitano,et al. Designing Neural Networks Using Genetic Algorithms with Graph Generation System , 1990, Complex Syst..
[522] Tara N. Sainath,et al. Deep Neural Networks for Acoustic Modeling in Speech Recognition: The Shared Views of Four Research Groups , 2012, IEEE Signal Processing Magazine.
[523] Elliot Soloway,et al. Learning to program = learning to construct mechanisms and explanations , 1986, CACM.
[524] Qiang Yang,et al. A Survey on Transfer Learning , 2010, IEEE Transactions on Knowledge and Data Engineering.
[525] Gerald DeJong,et al. Explanation-Based Learning: An Alternative View , 2005, Machine Learning.
[526] Peter Tiño,et al. Learning long-term dependencies in NARX recurrent neural networks , 1996, IEEE Trans. Neural Networks.
[527] Jürgen Schmidhuber,et al. State-Dependent Exploration for Policy Gradient Methods , 2008, ECML/PKDD.
[528] Jürgen Schmidhuber,et al. Semilinear Predictability Minimization Produces Well-Known Feature Detectors , 1996, Neural Computation.
[529] Barak A. Pearlmutter. Gradient calculations for dynamic recurrent neural networks: a survey , 1995, IEEE Trans. Neural Networks.
[530] L. C. Baird,et al. Reinforcement learning in continuous time: advantage updating , 1994, Proceedings of 1994 IEEE International Conference on Neural Networks (ICNN'94).
[531] Mike Schuster,et al. On supervised learning from sequential data with applications for speech recognition , 1999 .
[532] Ian H. Witten,et al. Stacked generalization: when does it work? , 1997, IJCAI 1997.
[533] Jürgen Schmidhuber,et al. Solving POMDPs with Levin Search and EIRA , 1996, ICML.
[534] D. Zipser,et al. A spiking network model of short-term active memory , 1993, The Journal of neuroscience : the official journal of the Society for Neuroscience.
[535] Daniele Loiacono,et al. Simulated Car Racing Championship: Competition Software Manual , 2013, ArXiv.
[536] Nicolas Brunel,et al. Dynamics of Sparsely Connected Networks of Excitatory and Inhibitory Spiking Neurons , 2000, Journal of Computational Neuroscience.
[537] B. Widrow,et al. The truck backer-upper: an example of self-learning in neural networks , 1989, International 1989 Joint Conference on Neural Networks.
[538] Christian Jacob,et al. Genetic L-System Programming , 1994, PPSN.
[539] J. Rubner,et al. Development of feature detectors by self-organization , 2004, Biological Cybernetics.
[540] Mitsuo Kawato,et al. Neural network control for a closed-loop System using Feedback-error-learning , 1993, Neural Networks.
[541] D. Hubel,et al. Receptive fields and functional architecture of monkey striate cortex , 1968, The Journal of physiology.
[542] Luca Maria Gambardella,et al. Learning Fine Motion by Using the Hierarchical Extended Kohonen Map , 1996, ICANN.
[543] Xiang Zhang,et al. OverFeat: Integrated Recognition, Localization and Detection using Convolutional Networks , 2013, ICLR.
[544] Rajat Raina,et al. Efficient sparse coding algorithms , 2006, NIPS.
[545] Jun Morimoto,et al. Robust Reinforcement Learning , 2005, Neural Computation.
[546] A. P. Wieland,et al. Evolving neural network controllers for unstable systems , 1991, IJCNN-91-Seattle International Joint Conference on Neural Networks.
[547] R. L. Stratonovich. CONDITIONAL MARKOV PROCESSES , 1960 .
[548] Frédéric Jurie,et al. Sampling Strategies for Bag-of-Features Image Classification , 2006, ECCV.
[549] Achilleas Zapranis,et al. Stock performance modeling using neural networks: A comparative study with regression models , 1994, Neural Networks.
[550] Pierre Baldi,et al. The dropout learning algorithm , 2014, Artif. Intell..
[551] R. E. Kalman,et al. A New Approach to Linear Filtering and Prediction Problems , 2002 .
[552] Dimitri P. Bertsekas,et al. Dynamic Programming and Optimal Control, Two Volume Set , 1995 .
[553] John E. Moody,et al. Fast Pruning Using Principal Components , 1993, NIPS.
[554] David J. Field,et al. What Is the Goal of Sensory Coding? , 1994, Neural Computation.
[555] Jürgen Schmidhuber,et al. Self-Delimiting Neural Networks , 2012, ArXiv.
[556] Stephen A. Cook,et al. The complexity of theorem-proving procedures , 1971, STOC.
[557] L. S. Pontryagin,et al. Mathematical Theory of Optimal Processes , 1962 .
[558] Stephen F. Smith,et al. A learning system based on genetic adaptive algorithms , 1980 .
[559] Gert Cauwenberghs,et al. A Fast Stochastic Error-Descent Algorithm for Supervised Learning and Optimization , 1992, NIPS.
[560] Andrzej Cichocki,et al. A New Learning Algorithm for Blind Signal Separation , 1995, NIPS.
[561] Paul J. Werbos,et al. Building and Understanding Adaptive Systems: A Statistical/Numerical Approach to Factory Automation and Brain Research , 1987, IEEE Transactions on Systems, Man, and Cybernetics.
[562] John F. Kolen,et al. Field Guide to Dynamical Recurrent Networks , 2001 .
[563] Tim Curran,et al. The Limits of Feedforward Vision: Recurrent Processing Promotes Robust Object Recognition when Objects Are Degraded , 2012, Journal of Cognitive Neuroscience.
[564] A. E. Bryson,et al. A Steepest-Ascent Method for Solving Optimum Programming Problems , 1962 .
[565] Stefan Carlsson,et al. CNN Features Off-the-Shelf: An Astounding Baseline for Recognition , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition Workshops.
[566] Wray L. Buntine,et al. Bayesian Back-Propagation , 1991, Complex Syst..
[567] B. McNaughton,et al. Population dynamics and theta rhythm phase precession of hippocampal place cell firing: A spiking neuron model , 1998, Hippocampus.
[568] Keechul Jung,et al. GPU implementation of neural networks , 2004, Pattern Recognit..
[569] Kyunghyun Cho,et al. Foundations and Advances in Deep Learning , 2014 .
[570] Bram Bakker,et al. Hierarchical Reinforcement Learning Based on Subgoal Discovery and Subpolicy Specialization , 2003 .
[571] Jürgen Schmidhuber,et al. Multi-column deep neural networks for image classification , 2012, 2012 IEEE Conference on Computer Vision and Pattern Recognition.
[572] Günther Palm,et al. On the Information Storage Capacity of Local Learning Rules , 1992, Neural Computation.
[573] Scott E. Fahlman,et al. An empirical study of learning speed in back-propagation networks , 1988 .
[574] Nicole Immorlica,et al. Locality-sensitive hashing scheme based on p-stable distributions , 2004, SCG '04.
[575] Pierre Baldi,et al. The Principled Design of Large-Scale Recursive Neural Network Architectures--DAG-RNNs and the Protein Structure Prediction Problem , 2003, J. Mach. Learn. Res..
[576] Jeremy Buhler,et al. Efficient large-scale sequence comparison by locality-sensitive hashing , 2001, Bioinform..
[577] Jürgen Schmidhuber,et al. On Fast Deep Nets for AGI Vision , 2011, AGI.
[578] P. J. Werbos,et al. Backpropagation and neurocontrol: a review and prospectus , 1989, International 1989 Joint Conference on Neural Networks.
[579] J. J. Hopfield,et al. Neural networks and physical systems with emergent collective computational abilities. , 1982, Proceedings of the National Academy of Sciences of the United States of America.
[580] Gomes de Freitas,et al. Bayesian methods for neural networks , 2000 .
[581] Tom Schaul,et al. Efficient natural evolution strategies , 2009, GECCO.
[582] Tapani Raiko,et al. Enhanced Gradient for Training Restricted Boltzmann Machines , 2013, Neural Computation.
[583] Panagiotis Manolios,et al. First-Order Recurrent Neural Networks and Deterministic Finite State Automata , 1994, Neural Computation.
[584] Michael I. Jordan. Supervised learning and systems with excess degrees of freedom , 1988 .
[585] Henry Markram,et al. Neural Networks with Dynamic Synapses , 1998, Neural Computation.
[586] Jirí Síma,et al. Loading Deep Networks Is Hard , 1994, Neural Comput..
[587] Richard S. Sutton,et al. GQ(lambda): A general gradient algorithm for temporal-difference prediction learning with eligibility traces , 2010, Artificial General Intelligence.
[588] Osamu Watanabe,et al. Kolmogorov Complexity and Computational Complexity , 2012, EATCS Monographs on Theoretical Computer Science.
[589] Jürgen Schmidhuber,et al. A fast learning algorithm for image segmentation with max-pooling convolutional networks , 2013, 2013 IEEE International Conference on Image Processing.
[590] Christian Jutten,et al. Blind separation of sources, part I: An adaptive algorithm based on neuromimetic architecture , 1991, Signal Process..
[591] F. Rosenblatt,et al. The perceptron: a probabilistic model for information storage and organization in the brain. , 1958, Psychological review.
[592] Ming Li,et al. An Introduction to Kolmogorov Complexity and Its Applications , 2019, Texts in Computer Science.
[593] Yoshua Bengio,et al. Learning long-term dependencies with gradient descent is difficult , 1994, IEEE Trans. Neural Networks.
[594] Jürgen Schmidhuber,et al. Co-evolving recurrent neurons learn deep memory POMDPs , 2005, GECCO '05.
[595] Faustino J. Gomez,et al. Intrinsically Motivated Evolutionary Search for Vision-Based Reinforcement Learning , 2011 .
[596] Christian Igel,et al. Neuroevolution for reinforcement learning using evolution strategies , 2003, The 2003 Congress on Evolutionary Computation, 2003. CEC '03..
[597] Helko Lehmann,et al. Computation in Recurrent Neural Networks: From Counters to Iterated Function Systems , 1998, Australian Joint Conference on Artificial Intelligence.
[598] Razvan Pascanu,et al. On the difficulty of training recurrent neural networks , 2012, ICML.
[599] Bruno A. Olshausen,et al. Inferring Sparse, Overcomplete Image Codes Using an Efficient Coding Framework , 1998, NIPS.
[600] Roberto Battiti. First- and second-order methods for learning , 1992 .
[601] Ronald L. Rivest,et al. Training a 3-node neural network is NP-complete , 1988, COLT '88.
[602] A. A. Mullin,et al. Principles of neurodynamics , 1962 .
[603] Tao Zhang,et al. Stable Adaptive Neural Network Control , 2001, The Springer International Series on Asian Studies in Computer and Information Science.
[604] Leslie Pack Kaelbling,et al. Learning Policies for Partially Observable Environments: Scaling Up , 1997, ICML.
[605] Peter Norvig,et al. Artificial Intelligence: A Modern Approach , 1995 .
[606] Babak Hassibi,et al. Second Order Derivatives for Network Pruning: Optimal Brain Surgeon , 1992, NIPS.
[607] Elie Bienenstock,et al. Neural Networks and the Bias/Variance Dilemma , 1992, Neural Computation.
[608] Ronen I. Brafman,et al. R-MAX - A General Polynomial Time Algorithm for Near-Optimal Reinforcement Learning , 2001, J. Mach. Learn. Res..
[609] Pierre Baldi,et al. Deep architectures for protein contact map prediction , 2012, Bioinform..
[610] Geoffrey E. Hinton,et al. Phone recognition using Restricted Boltzmann Machines , 2010, 2010 IEEE International Conference on Acoustics, Speech and Signal Processing.
[611] Steven J. Bradtke,et al. Linear Least-Squares algorithms for temporal difference learning , 2004, Machine Learning.
[612] P. Werbos,et al. Beyond Regression: New Tools for Prediction and Analysis in the Behavioral Sciences , 1974 .
[613] Tony Plate,et al. Holographic Recurrent Networks , 1992, NIPS.
[614] H. Akaike. A new look at the statistical model identification , 1974 .
[615] Roger Fletcher,et al. A Rapidly Convergent Descent Method for Minimization , 1963, Comput. J..
[616] Vittorio Maniezzo,et al. Genetic evolution of the topology and weight distribution of neural networks , 1994, IEEE Trans. Neural Networks.
[617] Ronald J. Williams,et al. A Learning Algorithm for Continually Running Fully Recurrent Neural Networks , 1989, Neural Computation.
[618] Yoshua Bengio,et al. Unsupervised and Transfer Learning Challenge: a Deep Learning Approach , 2011, ICML Unsupervised and Transfer Learning.
[619] Robert A. Legenstein,et al. Neural circuits for pattern recognition with small total wire length , 2002, Theor. Comput. Sci..
[620] Andrew G. Barto,et al. Skill Characterization Based on Betweenness , 2008, NIPS.
[621] Nitish Srivastava,et al. Improving neural networks by preventing co-adaptation of feature detectors , 2012, ArXiv.
[622] Joseph F. Murray,et al. Convolutional Networks Can Learn to Generate Affinity Graphs for Image Segmentation , 2010, Neural Computation.
[623] Pierre Baldi,et al. Autoencoders, Unsupervised Learning, and Deep Architectures , 2011, ICML Unsupervised and Transfer Learning.
[624] Luís B. Almeida,et al. A learning rule for asynchronous perceptrons with feedback in a combinatorial environment , 1990 .
[625] Yoshua Bengio,et al. Artificial neural networks and their application to sequence recognition , 1991 .
[626] Andrés Pérez Uribe,et al. Structure-Adaptable Digital Neural Networks , 1999 .
[627] Yann LeCun,et al. Traffic sign recognition with multi-scale Convolutional Networks , 2011, The 2011 International Joint Conference on Neural Networks.
[628] Reinhard Männer,et al. Multiprocessor And Memory Architecture Of The Neurocomputer Synapse-1 , 1993, Int. J. Neural Syst..
[629] Dario Floreano,et al. Hardware spiking neural network with run-time reconfigurable connectivity in an autonomous robot , 2003, NASA/DoD Conference on Evolvable Hardware, 2003. Proceedings..
[630] Dario Floreano,et al. From Animals to Animats 2: Proceedings of the Second International Conference on Simulation of Adaptive Behavior , 2000, Journal of Cognitive Neuroscience.
[631] D. Mackay,et al. Analysis of Linsker's application of Hebbian rules to linear networks , 1990 .
[632] John E. Moody,et al. Fast Learning in Multi-Resolution Hierarchies , 1988, NIPS.
[633] A. K. Rigler,et al. Accelerating the convergence of the back-propagation method , 1988, Biological Cybernetics.
[634] Jürgen Schmidhuber,et al. Learning Precise Timing with LSTM Recurrent Networks , 2003, J. Mach. Learn. Res..
[635] Schuster,et al. Separation of a mixture of independent signals using time delayed correlations. , 1994, Physical review letters.
[636] James Martens,et al. Deep learning via Hessian-free optimization , 2010, ICML.
[637] Jürgen Schmidhuber,et al. Multi-column deep neural network for traffic sign classification , 2012, Neural Networks.
[638] Anders Krogh,et al. Introduction to the theory of neural computation , 1994, The advanced book program.
[639] Jürgen Schmidhuber,et al. Reinforcement Learning in Markovian and Non-Markovian Environments , 1990, NIPS.
[640] H. B. Barlow,et al. Unsupervised Learning , 1989, Neural Computation.
[641] Sebastian Otte,et al. Local Feature Based Online Mode Detection with Recurrent Neural Networks , 2012, 2012 International Conference on Frontiers in Handwriting Recognition.
[642] Jude W. Shavlik,et al. Using knowledge-based neural networks to improve algorithms: Refining the Chou-Fasman algorithm for protein folding , 2004, Machine Learning.
[643] Martin A. Riedmiller. Neural Fitted Q Iteration - First Experiences with a Data Efficient Neural Reinforcement Learning Method , 2005, ECML.
[644] Teuvo Kohonen,et al. Self-organized formation of topologically correct feature maps , 2004, Biological Cybernetics.
[645] Leemon C. Baird,et al. Residual Algorithms: Reinforcement Learning with Function Approximation , 1995, ICML.
[646] Ralph Linsker,et al. Self-organization in a perceptual network , 1988, Computer.
[647] H. B. Barlow,et al. Finding Minimum Entropy Codes , 1989, Neural Computation.
[648] Pedro M. Domingos,et al. Sum-product networks: A new deep architecture , 2011, 2011 IEEE International Conference on Computer Vision Workshops (ICCV Workshops).
[649] Wolfgang Maass,et al. Lower Bounds for the Computational Power of Networks of Spiking Neurons , 1996, Neural Computation.
[650] K. S. Narendra,et al. Control of nonlinear dynamical systems using neural networks. II. Observability, identification, and control , 1996, IEEE Trans. Neural Networks.
[651] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[652] R. Vaillant,et al. An original approach for the localization of objects in images , 1993 .
[653] Stefan Schaal,et al. Reinforcement learning of motor skills with policy gradients , 2008 .
[654] Tapani Raiko,et al. Tikhonov-Type Regularization for Restricted Boltzmann Machines , 2012, ICANN.
[655] Mike Casey,et al. The Dynamics of Discrete-Time Computation, with Application to Recurrent Neural Networks and Finite State Machine Extraction , 1996, Neural Computation.
[656] Jude W. Shavlik,et al. Knowledge-Based Artificial Neural Networks , 1994, Artif. Intell..
[657] Alekseĭ Grigorʹevich Ivakhnenko,et al. Cybernetics and forecasting techniques , 1967 .
[658] Terrence J. Sejnowski,et al. Slow Feature Analysis: Unsupervised Learning of Invariances , 2002, Neural Computation.
[659] Barak A. Pearlmutter,et al. Automatic Learning Rate Maximization by On-Line Estimation of the Hessian's Eigenvectors , 1992, NIPS 1992.
[660] Bernard Widrow,et al. Neural networks: applications in industry, business and science , 1994, CACM.
[661] Dumitru Erhan,et al. Deep Neural Networks for Object Detection , 2013, NIPS.
[662] David Windisch. Loading Deep Networks Is Hard: The Pyramidal Case , 2005, Neural Computation.
[663] Tobi Delbruck,et al. Real-time classification and sensor fusion with a spiking deep belief network , 2013, Front. Neurosci..
[664] Maja J. Matarić,et al. Learning to Use Selective Attention and Short-Term Memory in Sequential Tasks , 1996 .
[665] Suzanna Becker,et al. Unsupervised Learning Procedures for Neural Networks , 1991, Int. J. Neural Syst..
[666] Ilya Sutskever,et al. Learning Recurrent Neural Networks with Hessian-Free Optimization , 2011, ICML.
[667] Thomas G. Dietterich. Hierarchical Reinforcement Learning with the MAXQ Value Function Decomposition , 1999, J. Artif. Intell. Res..
[668] Terrence J. Sejnowski,et al. An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.
[669] Kunihiko Fukushima,et al. Neocognitron: A self-organizing neural network model for a mechanism of pattern recognition unaffected by shift in position , 1980, Biological Cybernetics.
[670] Anton Gunzinger,et al. Fast neural net simulation with a DSP processor array , 1995, IEEE Trans. Neural Networks.
[671] Alfonso Valencia,et al. A hierarchical unsupervised growing neural network for clustering gene expression patterns , 2001, Bioinform..
[672] D. J. Felleman,et al. Distributed hierarchical processing in the primate cerebral cortex. , 1991, Cerebral cortex.
[673] Doina Precup,et al. Between MDPs and Semi-MDPs: A Framework for Temporal Abstraction in Reinforcement Learning , 1999, Artif. Intell..
[674] Michail G. Lagoudakis,et al. Least-Squares Policy Iteration , 2003, J. Mach. Learn. Res..
[675] Geoffrey E. Hinton,et al. Rectified Linear Units Improve Restricted Boltzmann Machines , 2010, ICML.
[676] Frank Sehnke,et al. Parameter-exploring policy gradients , 2010, Neural Networks.
[677] Mohammed Bennamoun,et al. Automatic Feature Learning for Robust Shadow Detection , 2014, 2014 IEEE Conference on Computer Vision and Pattern Recognition.
[678] Jürgen Schmidhuber. Discovering Solutions with Low Kolmogorov Complexity and High Generalization Capability , 1995, ICML.
[679] Christian W. Omlin,et al. A Machine Learning Method for Extracting Symbolic Knowledge from Recurrent Neural Networks , 2004, Neural Computation.
[680] Shie Mannor,et al. Q-Cut - Dynamic Discovery of Sub-goals in Reinforcement Learning , 2002, ECML.
[681] Guo-Zheng Sun,et al. Time Warping Invariant Neural Networks , 1992, NIPS.
[682] Gerhard Weiß,et al. Hierarchical Chunking in Classifier Systems , 1994, AAAI.
[683] Danil V. Prokhorov,et al. A Convolutional Learning System for Object Classification in 3-D Lidar Data , 2010, IEEE Transactions on Neural Networks.
[684] Geoffrey E. Hinton,et al. Semantic hashing , 2009, Int. J. Approx. Reason..
[685] Frank Fallside,et al. Dynamic reinforcement driven error propagation networks with application to game playing , 1989 .
[686] Bart L. M. Happel,et al. Design and evolution of modular neural network architectures , 1994, Neural Networks.
[687] Wolfgang Maass,et al. On the Computational Power of Winner-Take-All , 2000, Neural Computation.
[688] HighWire Press. Philosophical Transactions of the Royal Society of London , 1781, The London Medical Journal.
[689] Jürgen Schmidhuber,et al. Framewise phoneme classification with bidirectional LSTM and other neural network architectures , 2005, Neural Networks.
[690] Maneesh Sahani,et al. Regularization and nonlinearities for neural language models: when are they needed? , 2013, ArXiv.
[691] Patrice Y. Simard,et al. High Performance Convolutional Neural Networks for Document Processing , 2006 .
[692] Ha Hong,et al. Hierarchical Modular Optimization of Convolutional Networks Achieves Representations Similar to Macaque IT and Human Ventral Stream , 2013, NIPS.
[693] Shumeet Baluja,et al. A Method for Integrating Genetic Search Based Function Optimization and Competitive Learning , 1994 .
[694] A. Lindenmayer. Mathematical models for cellular interactions in development. I. Filaments with one-sided inputs. , 1968, Journal of theoretical biology.
[695] Radford M. Neal. Classification with Bayesian Neural Networks , 2005, MLCW.
[696] Alex Graves,et al. Playing Atari with Deep Reinforcement Learning , 2013, ArXiv.
[697] Yoshua Bengio,et al. Gradient Flow in Recurrent Nets: the Difficulty of Learning Long-Term Dependencies , 2001 .
[698] Risto Miikkulainen,et al. Evolving Keepaway Soccer Players through Task Decomposition , 2003, GECCO.
[699] Andreas Rauber,et al. The growing hierarchical self-organizing map , 2000, Proceedings of the IEEE-INNS-ENNS International Joint Conference on Neural Networks. IJCNN 2000. Neural Computing: New Challenges and Perspectives for the New Millennium.
[700] Alan J. Gross,et al. Self-Organizing Methods in Modeling , 1988 .
[701] George M. Siouris,et al. Applied Optimal Control: Optimization, Estimation, and Control , 1979, IEEE Transactions on Systems, Man, and Cybernetics.