Estimating the Required Training Dataset Size for Transmitter Classification Using Deep Learning

Despite the recent surge in the application of deep learning to wireless communication problems, very little is known about the training dataset size required to solve difficult problems, including transmitter classification, with acceptable accuracy. Many researchers rely on rules of thumb to estimate how much training data is needed for a given classification or identification task. For artificial neural network (ANN) research, these rules of thumb may suffice; however, for convolutional neural networks (CNNs), a class of deep neural networks, they may not hold, and researchers are often left to figure out for themselves the training dataset size needed for accurate classification. In this paper, we investigate the correlation between training dataset size and classification accuracy for transmitter classification applications by examining whether the rules of thumb used in ANN research apply to CNN-based transmitter classification tasks. We predict the classification performance of a CNN-based architecture for a given dataset size using a power law model fitted with the Levenberg-Marquardt algorithm. We use the chi-squared goodness-of-fit test to validate the predicted model. Our results show that we can predict classification accuracy for larger training dataset sizes across different experimental scenarios with at least 97.5% accuracy. We also compare our scheme with similar prior work in wireless transmitter classification. Finally, we propose a rule of thumb for the required training dataset size in transmitter classification using CNNs.
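As a rough illustration of the prediction step described above, the sketch below fits a power law learning curve with a hand-written Levenberg-Marquardt loop and then scores extrapolated accuracies with a Pearson chi-squared statistic. Several things here are assumptions, not details from the paper: the model form acc(n) = a - b * n^(-c), the starting guess, the helper names, and the data, which are synthetic values generated from known parameters purely so the example runs.

```python
import math

def power_law(n, a, b, c):
    # Assumed learning-curve form: accuracy rises toward the plateau a as n grows.
    return a - b * n ** (-c)

def sum_sq(params, sizes, accs):
    return sum((power_law(n, *params) - y) ** 2 for n, y in zip(sizes, accs))

def solve(A, g):
    # Gaussian elimination with partial pivoting for the small 3x3 system.
    k = len(g)
    m = [row[:] + [v] for row, v in zip(A, g)]
    for i in range(k):
        p = max(range(i, k), key=lambda r: abs(m[r][i]))
        m[i], m[p] = m[p], m[i]
        for r in range(i + 1, k):
            f = m[r][i] / m[i][i]
            for col in range(i, k + 1):
                m[r][col] -= f * m[i][col]
    x = [0.0] * k
    for i in reversed(range(k)):
        x[i] = (m[i][k] - sum(m[i][j] * x[j] for j in range(i + 1, k))) / m[i][i]
    return x

def fit_levenberg_marquardt(sizes, accs, start=(1.0, 1.0, 0.3), iters=200):
    params, lam = list(start), 1e-3
    cost = sum_sq(params, sizes, accs)
    for _ in range(iters):
        a, b, c = params
        r = [power_law(n, a, b, c) - y for n, y in zip(sizes, accs)]
        # Jacobian rows: partial derivatives of the model w.r.t. a, b, c.
        J = [[1.0, -n ** (-c), b * math.log(n) * n ** (-c)] for n in sizes]
        JtJ = [[sum(row[i] * row[j] for row in J) for j in range(3)] for i in range(3)]
        Jtr = [sum(row[i] * ri for row, ri in zip(J, r)) for i in range(3)]
        # Damped normal equations: (JtJ + lam * diag(JtJ)) * step = -Jtr
        A = [[JtJ[i][j] + (lam * JtJ[i][i] if i == j else 0.0) for j in range(3)]
             for i in range(3)]
        step = solve(A, [-g for g in Jtr])
        trial = [p + s for p, s in zip(params, step)]
        trial_cost = sum_sq(trial, sizes, accs)
        if trial_cost < cost:   # accept the step and reduce the damping
            params, cost, lam = trial, trial_cost, lam / 10
        else:                   # reject the step and increase the damping
            lam *= 10
    return params

def chi_squared_statistic(observed, expected):
    # Pearson statistic; smaller values mean predictions track observations closely.
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Synthetic accuracies (illustrative only, not results from the paper).
true_params = (0.98, 2.0, 0.5)
sizes = [100, 200, 500, 1000, 2000, 5000, 10000]
accs = [power_law(n, *true_params) for n in sizes]

a, b, c = fit_levenberg_marquardt(sizes, accs)

# Extrapolate to larger dataset sizes and compare with the "observed" values.
held_out = [20000, 50000]
predicted = [power_law(n, a, b, c) for n in held_out]
observed = [power_law(n, *true_params) for n in held_out]
chi2 = chi_squared_statistic(observed, predicted)
print(a, b, c, chi2)
```

In practice the fit would use measured accuracies from training runs at several dataset sizes, and the statistic would be compared against a chi-squared critical value for the chosen significance level and degrees of freedom. A library routine such as SciPy's `curve_fit` with `method="lm"` could replace the hand-written loop.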
