Dimensionality Reduction for Nonlinear Regression with Two Predictor Vectors

Many variables that we would like to predict depend nonlinearly on two types of attributes. For example, prices are influenced by supply and demand, and movie ratings are determined by demographic attributes and genre attributes. This paper addresses the dimensionality reduction problem in such regression problems with two predictor vectors. In particular, we assume a discriminative model in which low-dimensional linear embeddings of the two predictor vectors are sufficient statistics for predicting a dependent variable. We show that a simple algorithm involving a singular value decomposition can accurately estimate the embeddings provided that certain sample complexities are satisfied, and, surprisingly, it does so without requiring the nonlinear regression model to be specified. These embeddings improve the efficiency and robustness of subsequent training, and can serve as a pre-training step for neural networks. Our main results establish sample complexities under multiple settings; the sample complexities for different regression models differ only by constant factors.

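To make the model-free SVD estimate concrete, here is a minimal Python sketch, not the paper's algorithm verbatim. It assumes samples from y ≈ f(Uᵀx, Vᵀz) with roughly Gaussian, independent predictors, forms the empirical cross-moment M = (1/n) Σᵢ yᵢ xᵢ zᵢᵀ, and takes the leading singular subspaces of M as estimates of span(U) and span(V). The function name estimate_embeddings and the rank parameters r1, r2 are illustrative assumptions, not from the paper.

```python
import numpy as np

def estimate_embeddings(X, Z, y, r1, r2):
    """Moment-based sketch of an SVD embedding estimator (illustrative).

    Assumes y ~ f(U.T @ x, V.T @ z) with roughly Gaussian, independent
    predictors x and z. Under such assumptions the cross-moment
    M = E[y * x z^T] concentrates on a matrix whose column and row spaces
    align with span(U) and span(V), so its leading singular vectors give
    orthonormal bases for the two embeddings, with f left unspecified.
    """
    n = y.shape[0]
    M = (X.T * y) @ Z / n                    # empirical (p, q) cross-moment
    Uhat, _, Vt = np.linalg.svd(M, full_matrices=False)
    return Uhat[:, :r1], Vt[:r2].T           # bases for span(U), span(V)

# Toy check with rank-one embeddings and the bilinear link f(a, b) = a * b,
# for which E[y * x z^T] = u v^T exactly under the Gaussian assumption.
rng = np.random.default_rng(0)
n, p, q = 5000, 20, 30
u, v = rng.normal(size=(p, 1)), rng.normal(size=(q, 1))
X, Z = rng.normal(size=(n, p)), rng.normal(size=(n, q))
y = (X @ u).ravel() * (Z @ v).ravel() + 0.1 * rng.normal(size=n)
Uhat, Vhat = estimate_embeddings(X, Z, y, r1=1, r2=1)
# Uhat and Vhat should align, up to sign, with u/||u|| and v/||v||.
```

Note that the toy check never evaluates or fits f when estimating the embeddings; this is the sense in which the procedure can sidestep the nonlinear regression model, under the stated distributional assumptions.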