Towards a Unified Framework of Matrix Derivatives

Processing and analyzing massive statistics simultaneously requires derivatives of matrix-to-scalar functions (scalar-valued functions of matrices) and matrix-to-matrix functions (matrix-valued functions of matrices). Although the derivative of a matrix-to-scalar function has already been defined, how to write it as an algebraic expression is not as clear as it is for scalar-to-scalar functions (scalar-valued functions of scalars). Because there is no uniform way of applying the chain rule to matrix differentiation, we classify the approaches used in existing schemes into two kinds. The first relies on index notation for the matrices involved, with the indices eliminated when the matrices are multiplied. The second relies on vectorizing the matrices so that they can be treated in the same way as vector-to-vector functions (vector-valued functions of vectors), a case that is already settled. On the one hand, we find that the first approach generally has a much lower time complexity than the second. On the other hand, although most typical functions are known to be differentiable with the first approach, the second approach is in theory more general and fits any chain-rule routine. Nevertheless, under certain conditions the result of the second approach can be simplified to the same order of time complexity as that of the first, so it is important to establish these conditions. In this paper, we establish a sufficient condition under which not only can the first approach be applied, but the time complexity of the results obtained from the second approach can also be reduced. This condition is stated as two equivalent conditions, one corresponding to each approach. In addition, we generalize the methods and use the two approaches to compute derivatives under their respective conditions.
This work unifies the framework of matrix derivatives, which can support a range of applications in science and engineering.
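
To make the contrast between the two approaches concrete, the following is a minimal sketch on one assumed example, not the paper's general method or its conditions. It takes the hypothetical function f(X) = 0.5 * ||A X B||_F^2 and computes its gradient once from the index-notation form and once through vectorization with a Kronecker-product Jacobian. The function names, the choice of f, and the matrix sizes are all illustrative assumptions.

```python
import numpy as np


def grad_index_notation(A, X, B):
    """Gradient of f(X) = 0.5 * ||A X B||_F^2 from the index form.

    In indices, df/dX[k, l] = sum_{i, j} A[i, k] * Y[i, j] * B[l, j]
    with Y = A X B, which collapses to the matrix product A^T Y B^T.
    Only matrices of the original sizes are ever formed.
    """
    Y = A @ X @ B
    return A.T @ Y @ B.T


def grad_vectorized(A, X, B):
    """Same gradient via vectorization (column-major vec).

    Uses vec(A X B) = (B^T kron A) vec(X), so the Jacobian of vec(Y)
    with respect to vec(X) is J = B^T kron A, and the chain rule gives
    vec(df/dX) = J^T vec(Y). J has (m*q) * (n*p) entries, which is the
    source of the higher cost of this route.
    """
    Y = A @ X @ B
    J = np.kron(B.T, A)                  # explicit Jacobian, shape (m*q, n*p)
    g = J.T @ Y.reshape(-1, order="F")   # chain rule on the vectorized form
    return g.reshape(X.shape, order="F")  # undo the column-major vec


# Illustrative sizes: A is m x n, X is n x p, B is p x q.
rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
X = rng.standard_normal((4, 5))
B = rng.standard_normal((5, 2))

same = np.allclose(grad_index_notation(A, X, B), grad_vectorized(A, X, B))
print("gradients agree:", same)
```

In this example both routes return the same matrix, but the vectorized route builds the full Jacobian explicitly while the index-notation route never does. The example also shows the kind of simplification the abstract refers to: here J^T vec(Y) can be rewritten as vec(A^T Y B^T), which brings the vectorized result back to the cost of the first approach.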
