Power-law scaling to assist with key challenges in artificial intelligence

Power-law scaling, a central concept in critical phenomena, is found to be useful in deep learning, where optimized test errors on handwritten digit examples converge to zero as a power law of the dataset size. For rapid decision making with one training epoch, in which each example is presented to the trained network only once, the power-law exponent increases with the number of hidden layers. For the largest dataset, the obtained test error is estimated to be close to that of state-of-the-art algorithms trained for many epochs. Power-law scaling assists with key challenges in current artificial intelligence applications and facilitates an a priori estimate of the dataset size required to achieve a desired test accuracy. It establishes a benchmark for measuring training complexity and a quantitative hierarchy of machine learning tasks and algorithms.
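
The practical claim above, that test error decays as a power law of dataset size, implies a simple extrapolation recipe. Below is a minimal sketch, assuming the error follows ε(D) ≈ a·D^(−β): fit a and β by linear regression in log-log space, then invert the fit to estimate the dataset size needed for a target accuracy. The numerical values and variable names are illustrative assumptions, not data or code from the paper.

```python
import numpy as np

# Hypothetical measurements of test error vs. training-set size
# (placeholder values for illustration, not results from the paper).
dataset_sizes = np.array([1_000, 2_000, 4_000, 8_000, 16_000, 32_000])
test_errors = np.array([0.120, 0.085, 0.060, 0.042, 0.030, 0.021])

# Fit the power law  error ~ a * size**(-beta)  via linear regression
# in log-log space: log(error) = log(a) - beta * log(size).
slope, log_a = np.polyfit(np.log(dataset_sizes), np.log(test_errors), 1)
a, beta = np.exp(log_a), -slope  # beta > 0 when error decays with size

print(f"fitted power law: error ~ {a:.3f} * size^(-{beta:.3f})")

# Invert the fit for an a priori estimate of the dataset size
# needed to reach a desired test error.
target_error = 0.01
required_size = (a / target_error) ** (1.0 / beta)
print(f"estimated examples needed for {target_error:.0%} error: {required_size:,.0f}")
```

On clean power-law data the inversion is exact; in practice the extrapolated size is only as trustworthy as the fitted exponent, so the fit should be validated against held-out dataset sizes before being used to budget data collection.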
