Online logistic regression on manifolds

This paper describes a new method for online logistic regression when the feature vectors lie close to a low-dimensional manifold and when observations of the feature vectors may be noisy or have missing elements. The method exploits the low-dimensional structure of the feature vectors: it finds a multi-scale union of linear subsets that approximates the manifold and performs online logistic regression separately on each subset. The union of subsets enables better performance in the face of noisy and missing data and mitigates the curse of dimensionality. The effectiveness of the proposed method in predicting correct labels and in adapting to slowly time-varying manifolds is demonstrated using numerical examples and real data.
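
The idea of running a separate online logistic regression on each linear subset can be illustrated with a minimal sketch. It is not the paper's algorithm: it assumes the subsets (centers and orthonormal bases) are already known and fixed, whereas the paper estimates a multi-scale union of subsets online and tracks slow changes. The class name SubsetLogisticRegression, the step size, and the least-squares handling of missing entries are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


class SubsetLogisticRegression:
    """Illustrative sketch: one online logistic regressor per linear subset.

    Assumption: the subsets are given and fixed. Each subset k is the affine
    set {centers[k] + bases[k] @ beta}, with bases[k] of shape (D, d).
    """

    def __init__(self, centers, bases, step=0.5):
        self.centers = centers
        self.bases = bases
        self.step = step
        # One weight vector per subset: d coefficients plus a bias term.
        self.weights = [np.zeros(U.shape[1] + 1) for U in bases]

    def _project(self, x, mask):
        """Pick the subset that best fits the observed entries of x.

        Returns the subset index and the low-dimensional coordinates
        (with a trailing 1 for the bias).
        """
        best = None
        for k, (c, U) in enumerate(zip(self.centers, self.bases)):
            Um = U[mask]                 # rows of the basis that were observed
            r = x[mask] - c[mask]        # observed, centered entries
            beta, *_ = np.linalg.lstsq(Um, r, rcond=None)
            err = np.sum((r - Um @ beta) ** 2)
            if best is None or err < best[0]:
                best = (err, k, beta)
        _, k, beta = best
        return k, np.append(beta, 1.0)

    def predict_proba(self, x, mask):
        k, z = self._project(x, mask)
        return sigmoid(self.weights[k] @ z)

    def update(self, x, mask, y):
        """One stochastic-gradient step on the logistic log-likelihood."""
        k, z = self._project(x, mask)
        p = sigmoid(self.weights[k] @ z)
        self.weights[k] += self.step * (y - p) * z
        return p
```

Here x is a length-D feature vector, mask is a boolean array marking which entries were observed (unobserved entries are never read, so they may hold NaN), and y is a label in {0, 1}. Each observation updates only the regressor of the subset it is assigned to, so every update works in d dimensions rather than D.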
