Shao-Lun Huang | Lizhong Zheng | Gregory W. Wornell | Anuran Makur
[1] M. A. Chmielewski,et al. Elliptically Symmetric Distributions: A Review and Bibliography , 1981 .
[2] Venkat Anantharam,et al. On hypercontractivity and the mutual information between Boolean functions , 2013, 2013 51st Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[3] Reza Modarres,et al. Measures of Dependence , 2011, International Encyclopedia of Statistical Science.
[4] Lizhong Zheng,et al. Bounds between contraction coefficients , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[5] Venkat Anantharam,et al. Non-interactive simulation of joint distributions: The Hirschfeld-Gebelein-Rényi maximal correlation and the hypercontractivity ribbon , 2012, 2012 50th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[6] H. Hirschfeld. A Connection between Correlation and Contingency , 1935, Mathematical Proceedings of the Cambridge Philosophical Society.
[7] H. Gebelein. Das statistische Problem der Korrelation als Variations‐ und Eigenwertproblem und sein Zusammenhang mit der Ausgleichsrechnung , 1941 .
[8] Naftali Tishby,et al. Document clustering using word clusters via the information bottleneck method , 2000, SIGIR '00.
[9] A. Dawid. Spherical Matrix Distributions and a Multivariate Model , 1977 .
[10] Paul W. Cuff,et al. Gaussian secure source coding and Wyner's Common Information , 2015, 2015 IEEE International Symposium on Information Theory (ISIT).
[11] Shao-Lun Huang,et al. An efficient algorithm for information decomposition and extraction , 2015, 2015 53rd Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[12] Omer Levy,et al. Neural Word Embedding as Implicit Matrix Factorization , 2014, NIPS.
[13] Karl Pearson. On lines and planes of closest fit to systems of points in space , 1901 .
[14] D. Sorensen. Numerical methods for large eigenvalue problems , 2002, Acta Numerica.
[15] K. Pearson. Contributions to the Mathematical Theory of Evolution , 1894 .
[16] G. Young. Maximum likelihood estimation and factor analysis , 1941 .
[17] Radford M. Neal. Pattern Recognition and Machine Learning , 2007, Technometrics.
[18] Tian Zhang,et al. BIRCH: an efficient data clustering method for very large databases , 1996, SIGMOD '96.
[19] R. Clarke,et al. Theory and Applications of Correspondence Analysis , 1985 .
[20] Martin J. Wainwright,et al. Estimation of (near) low-rank matrices with noise and high-dimensional scaling , 2009, ICML.
[21] Punyashloka Biswal,et al. Hypercontractivity and its applications , 2011, ArXiv.
[22] Corinna Cortes,et al. Support-Vector Networks , 1995, Machine Learning.
[23] H. O. Lancaster. The Structure of Bivariate Distributions , 1958 .
[24] Andrea Montanari,et al. Matrix completion from a few entries , 2009, 2009 IEEE International Symposium on Information Theory.
[25] Allen Gersho,et al. Vector quantization and signal compression , 1991, The Kluwer international series in engineering and computer science.
[26] Jeff A. Bilmes,et al. Deep Canonical Correlation Analysis , 2013, ICML.
[27] W. F. Kibble. An extension of a theorem of Mehler's on Hermite polynomials , 1945, Mathematical Proceedings of the Cambridge Philosophical Society.
[28] Jeffrey Dean,et al. Efficient Estimation of Word Representations in Vector Space , 2013, ICLR.
[29] Yoshua Bengio,et al. Gradient-based learning applied to document recognition , 1998, Proc. IEEE.
[30] Sergio Verdú,et al. Approximation theory of output statistics , 1993, IEEE Trans. Inf. Theory.
[31] Alison L Gibbs,et al. On Choosing and Bounding Probability Metrics , 2002, math/0209021.
[32] C. Stein,et al. Estimation with Quadratic Loss , 1992 .
[33] Shotaro Akaho,et al. A kernel method for canonical correlation analysis , 2006, ArXiv.
[34] Ian T. Jolliffe,et al. Principal Component Analysis , 2002, International Encyclopedia of Statistical Science.
[35] Olivier Ledoit,et al. A well-conditioned estimator for large-dimensional covariance matrices , 2004 .
[36] Lizhong Zheng,et al. Euclidean Information Theory , 2008, 2008 IEEE International Zurich Seminar on Communications.
[37] D. Brillinger. Time series - data analysis and theory , 1981, Classics in applied mathematics.
[38] Arindam Banerjee,et al. Probabilistic Semi-Supervised Clustering with Constraints , 2006, Semi-Supervised Learning.
[39] Douglas B. Terry,et al. Using collaborative filtering to weave an information tapestry , 1992, CACM.
[40] Meir Feder,et al. An Information-Theoretic Framework for Non-linear Canonical Correlation Analysis , 2018, ArXiv.
[41] Naftali Tishby,et al. Deep learning and the information bottleneck principle , 2015, 2015 IEEE Information Theory Workshop (ITW).
[42] J. Friedman,et al. Estimating Optimal Transformations for Multiple Regression and Correlation. , 1985 .
[43] Kenneth Rose,et al. An Information-theoretic Learning Algorithm for Neural Network Classification , 1995, NIPS.
[44] P. Gács,et al. Spreading of Sets in Product Spaces and Hypercontraction of the Markov Operator , 1976 .
[45] Kenneth Ward Church,et al. Word Association Norms, Mutual Information, and Lexicography , 1989, ACL.
[46] T. W. Anderson. An Introduction to Multivariate Statistical Analysis , 1959 .
[47] H. Hotelling. Analysis of a complex of statistical variables into principal components. , 1933 .
[48] Kurt Hornik,et al. Multilayer feedforward networks are universal approximators , 1989, Neural Networks.
[49] David Slepian,et al. On the Symmetrized Kronecker Power of a Matrix and Extensions of Mehler’s Formula for Hermite Polynomials , 1972 .
[50] Wm. R. Wright. General Intelligence, Objectively Determined and Measured. , 1905 .
[51] Emmanuel J. Candès,et al. A Probabilistic and RIPless Theory of Compressed Sensing , 2010, IEEE Transactions on Information Theory.
[52] Xiaodong Li,et al. Dense error correction for low-rank matrices via Principal Component Pursuit , 2010, 2010 IEEE International Symposium on Information Theory.
[53] Joel A. Tropp,et al. User-Friendly Tail Bounds for Sums of Random Matrices , 2010, Found. Comput. Math.
[54] Aaron D. Wyner,et al. The common information of two dependent random variables , 1975, IEEE Trans. Inf. Theory.
[55] Saharon Rosset,et al. Generalized Independent Component Analysis Over Finite Alphabets , 2016, IEEE Trans. Inf. Theory.
[56] Naftali Tishby,et al. The information bottleneck method , 2000, ArXiv.
[57] Imre Csiszár,et al. Information Theory and Statistics: A Tutorial , 2004, Found. Trends Commun. Inf. Theory.
[58] Ken R. Duffy,et al. Principal Inertia Components and Applications , 2017, IEEE Transactions on Information Theory.
[59] Emmanuel J. Candès,et al. Matrix Completion With Noise , 2009, Proceedings of the IEEE.
[60] Yehuda Koren,et al. Matrix Factorization Techniques for Recommender Systems , 2009, Computer.
[61] F. G. Mehler. Ueber die Entwicklung einer Function von beliebig vielen Variablen nach Laplaceschen Functionen höherer Ordnung. , 1866 .
[62] Erkki Oja,et al. A class of neural networks for independent component analysis , 1997, IEEE Trans. Neural Networks.
[63] Gal Chechik,et al. Information Bottleneck for Gaussian Variables , 2003, J. Mach. Learn. Res.
[64] Kilian Q. Weinberger,et al. Spectral Methods for Dimensionality Reduction , 2006, Semi-Supervised Learning.
[65] Lizhong Zheng,et al. Polynomial Singular Value Decompositions of a Family of Source-Channel Models , 2017, IEEE Transactions on Information Theory.
[66] Shao-Lun Huang,et al. An information-theoretic approach to universal feature selection in high-dimensional inference , 2017, 2017 IEEE International Symposium on Information Theory (ISIT).
[67] Huda Khayrallah,et al. Deep Generalized Canonical Correlation Analysis , 2017, RepL4NLP@ACL.
[68] Stefano Soatto,et al. Emergence of invariance and disentangling in deep representations , 2017 .
[69] Nathan Srebro,et al. Learning with matrix factorizations , 2004 .
[70] E. Oja. Simplified neuron model as a principal component analyzer , 1982, Journal of mathematical biology.
[71] B. L. Roux,et al. Geometric Data Analysis: From Correspondence Analysis to Structured Data Analysis , 2004 .
[72] M. Kramer. Nonlinear principal component analysis using autoassociative neural networks , 1991 .
[73] Jim Kay,et al. Canonical Correlation Analysis Using a Neural Network , 1992 .
[74] Alexander Basilevsky,et al. Statistical Factor Analysis and Related Methods , 1994 .
[75] Venkat Anantharam,et al. On Maximal Correlation, Hypercontractivity, and the Data Processing Inequality studied by Erkip and Cover , 2013, ArXiv.
[76] George Cybenko,et al. Approximation by superpositions of a sigmoidal function , 1989, Math. Control. Signals Syst.
[77] Sophie Ahrens,et al. Recommender Systems , 2012 .
[78] N. L. Johnson,et al. Linear Statistical Inference and Its Applications , 1966 .
[79] W. Rudin. Principles of mathematical analysis , 1964 .
[80] Andrew R. Barron,et al. Universal approximation bounds for superpositions of a sigmoidal function , 1993, IEEE Trans. Inf. Theory.
[81] I. Csiszár. A class of measures of informativity of observation channels , 1972 .
[82] G. Stewart. Perturbation theory for the singular value decomposition , 1990 .
[83] A. Lewis. The Convex Analysis of Unitarily Invariant Matrix Functions , 1995 .
[84] Duolao Wang,et al. Estimating Optimal Transformations for Multiple Regression Using the ACE Algorithm , 2004, Journal of Data Science.
[85] R. W. Wedderburn,et al. Generalized Linear Models Specified in Terms of Constraints , 1974 .
[86] Emmanuel J. Candès,et al. Exact Matrix Completion via Convex Optimization , 2008, Found. Comput. Math.
[87] Pedro J. Zufiria,et al. Generalized neural networks for spectral analysis: dynamics and Liapunov functions , 2004, Neural Networks.
[88] Shao-Lun Huang,et al. Gaussian Universal Features, Canonical Correlations, and Common Information , 2018, 2018 IEEE Information Theory Workshop (ITW).
[89] Lizhong Zheng,et al. A Coordinate System for Gaussian Networks , 2010, IEEE Transactions on Information Theory.
[90] Yihong Wu,et al. Strong data-processing inequalities for channels and Bayesian networks , 2015, 1508.06025.
[91] C. Anderson‐Cook,et al. An Introduction to Multivariate Statistical Analysis (3rd ed.) (Book) , 2004 .
[92] D. Cox. The Regression Analysis of Binary Sequences , 2017 .
[93] James Bennett,et al. The Netflix Prize , 2007 .
[94] Lizhong Zheng,et al. Polynomial spectral decomposition of conditional expectation operators , 2016, 2016 54th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[95] H. Hotelling. Relations Between Two Sets of Variates , 1936 .
[96] D. Rubin,et al. Maximum likelihood from incomplete data via the EM algorithm plus discussions on the paper , 1977 .
[97] Daniel M. Roy,et al. Neural Network Matrix Factorization , 2015, ArXiv.
[98] Charles R. Johnson,et al. Matrix Analysis, 2nd Ed , 2012 .
[99] M. Veloso,et al. Latent Variable Models , 2019, Statistical and Econometric Methods for Transportation Data Analysis.
[100] Jon Atli Benediktsson,et al. Linear Versus Nonlinear PCA for the Classification of Hyperspectral Data Based on the Extended Morphological Profiles , 2012, IEEE Geoscience and Remote Sensing Letters.
[101] Lizhong Zheng,et al. Probabilistic Clustering using Maximal Matrix Norm Couplings , 2018, 2018 56th Annual Allerton Conference on Communication, Control, and Computing (Allerton).
[102] M. Haber. Maximum likelihood methods for linear and log-linear models in categorical data , 1985 .
[103] Lizhong Zheng,et al. Linear Bounds between Contraction Coefficients for $f$-Divergences , 2015, 1510.01844.
[104] Karen Livescu,et al. Nonparametric Canonical Correlation Analysis , 2015, ICML.
[105] Erkki Oja,et al. The nonlinear PCA learning rule in independent component analysis , 1997, Neurocomputing.
[106] Shao-Lun Huang,et al. Linear information coupling problems , 2012, 2012 IEEE International Symposium on Information Theory Proceedings.
[107] Paul Resnick,et al. Recommender systems , 1997, CACM.
[108] Emmanuel J. Candès,et al. The Power of Convex Relaxation: Near-Optimal Matrix Completion , 2009, IEEE Transactions on Information Theory.
[109] Naftali Tishby,et al. Opening the Black Box of Deep Neural Networks via Information , 2017, ArXiv.
[110] G. W. Stewart. On the Early History of the Singular Value Decomposition , 2022 .
[111] A. Izenman. Reduced-rank regression for the multivariate linear model , 1975 .
[112] Xiangxiang Xu,et al. On The Sample Complexity of HGR Maximal Correlation Functions , 2019, 2019 IEEE Information Theory Workshop (ITW).
[113] M. Manser,et al. Chi-Squared Distribution , 2010 .
[114] H. Witsenhausen. On Sequences of Pairs of Dependent Random Variables , 1975 .
[115] Venkat Anantharam,et al. On hypercontractivity and a data processing inequality , 2014, 2014 IEEE International Symposium on Information Theory.
[116] C. Eckart,et al. The approximation of one matrix by another of lower rank , 1936 .
[117] A. Tsybakov,et al. Estimation of high-dimensional low-rank matrices , 2009, 0912.5338.
[118] Thomas A. Courtade,et al. Which Boolean functions are most informative? , 2013, 2013 IEEE International Symposium on Information Theory.
[119] Kurt Hornik,et al. Neural networks and principal component analysis: Learning from examples without local minima , 1989, Neural Networks.
[120] Naftali Tishby,et al. Data Clustering by Markovian Relaxation and the Information Bottleneck Method , 2000, NIPS.
[121] J. Leeuw,et al. The Gifi system of descriptive multivariate analysis , 1998 .
[122] H. J. Scudder,et al. Probability of error of some adaptive pattern-recognition machines , 1965, IEEE Trans. Inf. Theory.
[123] Pablo A. Parrilo,et al. Guaranteed Minimum-Rank Solutions of Linear Matrix Equations via Nuclear Norm Minimization , 2007, SIAM Rev.
[124] Sajid Javed,et al. Robust Subspace Learning: Robust PCA, Robust Subspace Tracking, and Robust Subspace Recovery , 2017, IEEE Signal Processing Magazine.
[125] Erkki Oja,et al. Principal components, minor components, and linear neural networks , 1992, Neural Networks.
[126] E. Schmidt. Zur Theorie der linearen und nichtlinearen Integralgleichungen , 1907 .
[127] Robert J. Plemmons,et al. Nonnegative Matrices in the Mathematical Sciences , 1979, Classics in Applied Mathematics.
[128] Meir Feder,et al. Binary independent component analysis: Theory, bounds and algorithms , 2016, 2016 IEEE 26th International Workshop on Machine Learning for Signal Processing (MLSP).
[129] David G. Stork,et al. Pattern Classification (2nd ed.) , 1999 .
[130] Alexander A. Alemi,et al. Deep Variational Information Bottleneck , 2017, ICLR.
[131] John Shawe-Taylor,et al. Canonical Correlation Analysis: An Overview with Application to Learning Methods , 2004, Neural Computation.
[132] Jean Ponce,et al. Convex Sparse Matrix Factorizations , 2008, ArXiv.
[133] A. J. Bell,et al. A Unifying Information-Theoretic Framework for Independent Component Analysis , 2000 .
[134] Xiaojin Zhu,et al. Semi-Supervised Learning , 2010, Encyclopedia of Machine Learning.