Paper information - iNALU: Improved Neural Arithmetic Logic Unit

Neural networks have to capture mathematical relationships in order to learn various tasks. They approximate these relationships only implicitly and therefore often do not generalize well. The recently proposed Neural Arithmetic Logic Unit (NALU) is a neural architecture that represents mathematical relationships explicitly in the units of the network, which lets it learn operations such as summation, subtraction, or multiplication. Although NALUs have been shown to perform well on various downstream tasks, an in-depth analysis reveals practical shortcomings by design, such as the inability to multiply or divide negative input values and training stability issues in deeper networks. We address these issues and propose an improved model architecture. We evaluate our model empirically in various settings, from learning basic arithmetic operations to more complex functions. Our experiments indicate that our model solves the stability issues and outperforms the original NALU model in terms of arithmetic precision and convergence.
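The sign limitation mentioned in the abstract can be seen in a minimal sketch of the multiplicative path of the original NALU, which computes products in log space as exp(sum_i w_i * log(|x_i| + eps)). The function name, the value of `EPS`, and the hand-set weights below are assumptions made for illustration; the sketch covers only this one path, not the gating or the learned weight parameterization, and it does not reproduce the improved architecture proposed in the paper.

```python
import math

EPS = 1e-10  # assumed small constant that keeps log() finite at zero


def nalu_multiplicative_path(x, w):
    """Log-space product path: exp(sum_i w_i * log(|x_i| + EPS)).

    Taking |x_i| before the logarithm discards the sign of every input,
    so the result is always positive.
    """
    log_sum = sum(wi * math.log(abs(xi) + EPS) for xi, wi in zip(x, w))
    return math.exp(log_sum)


# Hand-set weights (not learned): +1 multiplies by an input, -1 divides by it.
product = nalu_multiplicative_path([-3.0, 4.0], [1.0, 1.0])
quotient = nalu_multiplicative_path([-8.0, 2.0], [1.0, -1.0])

print(product)   # close to +12, although -3 * 4 = -12
print(quotient)  # close to +4, although -8 / 2 = -4
```

Because the absolute value is taken before the logarithm, no choice of weights in this path can produce a negative output, which is the design-level shortcoming the abstract refers to.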
