On Complex Valued Convolutional Neural Networks

Convolutional neural networks (CNNs) are the state-of-the-art models for supervised machine learning in computer vision. In recent years, CNNs have outperformed traditional approaches in many computer vision tasks, including object detection, image classification, and face recognition. However, CNNs are prone to overfitting, and much research focuses on regularization methods to counteract it. One such approach is to design task-specific models based on prior knowledge. Several works have shown that properties of natural images are naturally captured using complex numbers. Motivated by these works, we present a variant of the CNN model with complex-valued input and weights. We construct the complex model as a generalization of the real model. The lack of an order on the complex field raises several difficulties, both in the definition and in the training of the network; we address these issues and suggest possible solutions. The resulting model is shown to be a restricted form of a real-valued CNN with twice the parameters. It is sensitive to phase structure, and we argue that it serves as a regularized model for problems where such structure is important. We verify this suggestion empirically by comparing the performance of a complex network and a real network on a cell-detection problem. The two networks achieve comparable results, and although the complex model is harder to train, it is significantly less prone to overfitting. We also demonstrate that the complex network detects meaningful phase structure in the data.
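To make the "restricted form of a real-valued CNN with twice the parameters" claim concrete, the following is a minimal NumPy sketch (illustrative only, not the paper's implementation): a complex convolution of input x = xr + i*xi with kernel w = wr + i*wi expands, by bilinearity, into real convolutions whose kernels appear in the tied pattern [wr, -wi; wi, wr]. A real CNN on the stacked planes [xr, xi] could learn four independent kernels here; the complex model restricts them to this shared structure.

```python
import numpy as np
from scipy.signal import convolve2d

# Toy example: one complex-valued 2D convolution expressed via real
# convolutions. For x = xr + i*xi and w = wr + i*wi, the product rule gives
#   x * w = (xr*wr - xi*wi) + i*(xr*wi + xi*wr),
# i.e. four real convolutions built from only two real kernels.

rng = np.random.default_rng(0)
xr, xi = rng.standard_normal((2, 8, 8))   # real and imaginary input planes
wr, wi = rng.standard_normal((2, 3, 3))   # real and imaginary kernel planes

# The complex convolution computed directly on complex arrays.
direct = convolve2d(xr + 1j * xi, wr + 1j * wi, mode="valid")

# The same result from real-valued convolutions with tied kernels: an
# unconstrained real network with twice the parameters could replace
# (wr, -wi) and (wi, wr) by four free kernels, hence "restricted form".
real_part = convolve2d(xr, wr, mode="valid") - convolve2d(xi, wi, mode="valid")
imag_part = convolve2d(xr, wi, mode="valid") + convolve2d(xi, wr, mode="valid")

assert np.allclose(direct, real_part + 1j * imag_part)
```

The tied-weight decomposition also hints at why the complex model acts as a regularizer: it spans a strict subspace of the real model's hypothesis class, one that is equivariant to global phase shifts of the input.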
