Beyond Independent Components: Trees and Clusters

We present a generalization of independent component analysis (ICA), where instead of looking for a linear transform that makes the data components independent, we look for a transform that makes the data components well fit by a tree-structured graphical model. This tree-dependent component analysis (TCA) provides a tractable and flexible approach to weakening the assumption of independence in ICA. In particular, TCA allows the underlying graph to have multiple connected components, and thus the method is able to find "clusters" of components such that components are dependent within a cluster and independent between clusters. Finally, we make use of a notion of graphical models for time series due to Brillinger (1996) to extend these ideas to the temporal setting. In particular, we are able to fit models that incorporate tree-structured dependencies among multiple time series.
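As a rough illustration of the tree-fitting idea only, and not the estimation procedure of the paper, the sketch below takes a fixed set of components, scores every pair by mutual information, and keeps a maximum-weight spanning forest in the spirit of the dependence trees of reference [1]. Several choices here are assumptions made for the example: the Gaussian formula for pairwise mutual information, the `threshold` that decides when two components are left unconnected, and the helper names `gaussian_pairwise_mi` and `max_weight_forest`. The search over linear transforms that TCA performs is not shown.

```python
import numpy as np


def gaussian_pairwise_mi(x):
    """Pairwise mutual information under a Gaussian assumption (illustrative).

    x has shape (n_samples, n_components). For jointly Gaussian variables
    with correlation rho, I = -0.5 * log(1 - rho**2).
    """
    rho = np.corrcoef(x, rowvar=False)
    rho2 = np.clip(rho ** 2, 0.0, 1.0 - 1e-12)  # keep the log finite
    mi = -0.5 * np.log1p(-rho2)
    np.fill_diagonal(mi, 0.0)
    return mi


def max_weight_forest(weights, threshold=0.0):
    """Kruskal-style maximum-weight spanning forest.

    Edges with weight <= threshold are never added, so the result may have
    several connected components ("clusters"). The threshold is an assumed
    stand-in for a model-selection criterion.
    """
    m = weights.shape[0]
    parent = list(range(m))

    def find(i):
        # Union-find root lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    candidates = sorted(
        ((weights[i, j], i, j) for i in range(m) for j in range(i + 1, m)),
        reverse=True,
    )
    edges = []
    for w, i, j in candidates:
        if w <= threshold:
            break  # all remaining edges are at least as weak
        ri, rj = find(i), find(j)
        if ri != rj:  # adding the edge does not create a cycle
            parent[ri] = rj
            edges.append((i, j))

    groups = {}
    for i in range(m):
        groups.setdefault(find(i), []).append(i)
    return edges, sorted(groups.values())


# Synthetic components: 0 and 1 are dependent, 2 and 3 are dependent,
# and the two pairs are independent of each other.
rng = np.random.default_rng(0)
x = rng.standard_normal((5000, 4))
x[:, 1] += x[:, 0]
x[:, 3] += x[:, 2]

edges, clusters = max_weight_forest(gaussian_pairwise_mi(x), threshold=0.05)
print(edges, clusters)
```

On this synthetic data the forest has two edges, one inside each dependent pair, and the connected components recover the two clusters. With `threshold=0.0` the same routine would return a single spanning tree instead of a forest.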

[1]  C. N. Liu,et al.  Approximating discrete probability distributions with dependence trees , 1968, IEEE Trans. Inf. Theory.

[2]  J. Rissanen,et al.  Modeling By Shortest Data Description , 1978, Autom..

[3]  P. J. Green,et al.  Density Estimation for Statistics and Data Analysis , 1987 .

[4]  Thomas M. Cover,et al.  Elements of Information Theory , 2005 .

[5]  Radim Jirousek,et al.  Solution of the marginal problem and decomposable distributions , 1991, Kybernetika.

[6]  Richard A. Davis,et al.  Time Series: Theory and Methods , 2013 .

[7]  Geoffrey E. Hinton,et al.  Adaptive Mixtures of Local Experts , 1991, Neural Computation.

[8]  P. Bickel Efficient and Adaptive Estimation for Semiparametric Models , 1993 .

[9]  A. Dawid,et al.  Hyper Markov Laws in the Statistical Analysis of Decomposable Graphical Models , 1993 .

[10]  Pierre Comon,et al.  Independent component analysis, A new concept? , 1994, Signal Process..

[11]  W. Clem Karl,et al.  Efficient multiscale regularization with applications to the computation of optical flow , 1994, IEEE Trans. Image Process..

[12]  K. Do,et al.  Efficient and Adaptive Estimation for Semiparametric Models , 1994 .

[13]  Terrence J. Sejnowski,et al.  An Information-Maximization Approach to Blind Separation and Blind Deconvolution , 1995, Neural Computation.

[14]  Andrzej Cichocki,et al.  A New Learning Algorithm for Blind Signal Separation , 1995, NIPS.

[15]  Dinh-Tuan Pham,et al.  Blind separation of instantaneous mixture of sources via an independent component analysis , 1996, IEEE Trans. Signal Process..

[16]  Nir Friedman,et al.  Learning Bayesian Networks with Local Structure , 1996, UAI.

[17]  D. Brillinger Remarks Concerning Graphical Models for Time Series and Point Processes , 1996 .

[18]  Eric Moulines,et al.  A blind source separation technique using second-order statistics , 1997, IEEE Trans. Signal Process..

[19]  John N. Tsitsiklis,et al.  Introduction to linear optimization , 1997, Athena scientific optimization and computation series.

[20]  Andreas Ziehe,et al.  TDSEP - an efficient algorithm for blind separation using time structure , 1998 .

[21]  A. W. van der Vaart,et al.  On Profile Likelihood , 2000 .

[22]  Jean-François Cardoso,et al.  Multidimensional independent component analysis , 1998, Proceedings of the 1998 IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP '98 (Cat. No.98CH36181).

[23]  Alexander J. Smola,et al.  Learning with kernels , 1998 .

[24]  Michael I. Jordan Graphical Models , 2003 .

[25]  Hagai Attias,et al.  Independent Factor Analysis , 1999, Neural Computation.

[26]  Shotaro Akaho,et al.  MICA: multimodal independent component analysis , 1999, IJCNN'99. International Joint Conference on Neural Networks. Proceedings (Cat. No.99CH36339).

[27]  J. Raz,et al.  A Simple GCV Method of Span Selection for Periodogram Smoothing , 1999 .

[28]  Jean-François Cardoso High-Order Contrasts for Independent Component Analysis , 1999, Neural Computation.

[29]  R. Dahlhaus Graphical interaction models for multivariate time series , 2000 .

[30]  Dinh Tuan Pham,et al.  Blind separation of instantaneous mixture of sources via the Gaussian mutual information criterion , 2000, 2000 10th European Signal Processing Conference.

[31]  J. Pearl Causality: Models, Reasoning and Inference , 2000 .

[32]  Aapo Hyvärinen,et al.  Emergence of Phase- and Shift-Invariant Features by Decomposition of Natural Images into Independent Feature Subspaces , 2000, Neural Computation.

[33]  Nello Cristianini,et al.  Classification using String Kernels , 2000 .

[34]  J. Raz,et al.  A simple generalised crossvalidation method of span selection for periodogram smoothing , 2001 .

[35]  Max Welling,et al.  A Constrained EM Algorithm for Independent Component Analysis , 2001, Neural Computation.

[36]  D. Pham CONTRAST FUNCTIONS FOR BLIND SEPARATION AND DECONVOLUTION OF SOURCES , 2001 .

[37]  Aapo Hyvärinen,et al.  Topographic Independent Component Analysis , 2001, Neural Computation.

[38]  A. Willsky Multiresolution Markov models for signal and image processing , 2002, Proc. IEEE.

[39]  Michael I. Jordan,et al.  Learning Graphical Models with Mercer Kernels , 2002, NIPS.

[40]  Robert Tibshirani,et al.  Independent Components Analysis through Product Density Estimation , 2002, NIPS.

[41]  Dinh-Tuan Pham,et al.  Mutual information approach to blind separation of stationary sources , 2002, IEEE Trans. Inf. Theory.

[42]  Michael I. Jordan,et al.  Kernel independent component analysis , 2003, 2003 IEEE International Conference on Acoustics, Speech, and Signal Processing, 2003. Proceedings. (ICASSP '03)..

[43]  John W. Fisher,et al.  ICA Using Spacings Estimates of Entropy , 2003, J. Mach. Learn. Res..

[44]  D. Pham FAST ALGORITHM FOR ESTIMATING MUTUAL INFORMATION, ENTROPIES AND SCORE FUNCTIONS , 2003 .

[45]  Eric R. Ziegel,et al.  The Elements of Statistical Learning , 2003, Technometrics.

[46]  David Maxwell Chickering,et al.  Learning Bayesian Networks: The Combination of Knowledge and Statistical Data , 1994, Machine Learning.

[47]  Michael I. Jordan,et al.  Dimensionality Reduction for Supervised Learning with Reproducing Kernel Hilbert Spaces , 2004, J. Mach. Learn. Res..

[48]  N. Davies Multiple Time Series , 2005 .