Extending the Beta divergence to complex values

Abstract: Various information-theoretic divergences have been proposed as cost functions for tasks such as matrix factorization and clustering. One such class is the Beta divergence. By varying a real-valued parameter β, the Beta divergence connects several well-known divergences, including the Euclidean distance, the Kullback-Leibler divergence, and the Itakura-Saito divergence. Unfortunately, the Beta divergence is properly defined only for positive real values, which hinders its use for measuring distances between complex-valued data points. We define a new divergence, the Complex Beta divergence, that operates on complex values, and show that it coincides with the standard Beta divergence when the data are restricted to be in phase. Moreover, we show that different values of β place different penalties on errors in magnitude and in phase.
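
To make the role of β concrete, the following is a minimal sketch of the standard (real, positive) Beta divergence in one common parameterization from the matrix factorization literature; the normalization used in the paper itself may differ. The function name `beta_divergence` and its scalar interface are assumptions made for illustration. The Complex Beta divergence is not sketched here because the abstract does not state its definition.

```python
import math


def beta_divergence(x, y, beta):
    """Standard Beta divergence d_beta(x | y) for positive real scalars.

    Illustrative sketch only: uses one common parameterization and
    treats beta = 0 and beta = 1 as the usual limiting cases.
    """
    if x <= 0 or y <= 0:
        # The standard definition requires strictly positive arguments.
        raise ValueError("x and y must be positive real numbers")
    if beta == 0:
        # Limit beta -> 0: Itakura-Saito divergence.
        ratio = x / y
        return ratio - math.log(ratio) - 1.0
    if beta == 1:
        # Limit beta -> 1: generalized Kullback-Leibler divergence.
        return x * math.log(x / y) - x + y
    # General case; beta = 2 gives half the squared Euclidean distance.
    numerator = x**beta + (beta - 1) * y**beta - beta * x * y ** (beta - 1)
    return numerator / (beta * (beta - 1))


if __name__ == "__main__":
    x, y = 3.0, 1.0
    for b in (0, 1, 2):
        print(b, beta_divergence(x, y, b))
```

In this parameterization, β = 2 yields (x − y)²/2, β = 1 yields the generalized Kullback-Leibler divergence, and β = 0 yields the Itakura-Saito divergence. The guard on non-positive inputs mirrors the restriction to positive real values that motivates the complex extension.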
