Introduction to High-Dimensional Statistics

Preface

Acknowledgments

Introduction
  High-Dimensional Data
  Curse of Dimensionality
    Lost in the Immensity of High-Dimensional Spaces
    Fluctuations Cumulate
    An Accumulation of Rare Events May Not Be Rare
    Computational Complexity
  High-Dimensional Statistics
    Circumventing the Curse of Dimensionality
    A Paradigm Shift
    Mathematics of High-Dimensional Statistics
  About This Book
    Statistics and Data Analysis
    Purpose of This Book
    Overview
  Discussion and References
    Take-Home Message
    References
  Exercises
    Strange Geometry of High-Dimensional Spaces
    Volume of a p-Dimensional Ball
    Tails of a Standard Gaussian Distribution
    Principal Component Analysis
    Basics of Linear Regression
    Concentration of the Square Norm of a Gaussian Random Variable

Model Selection
  Statistical Setting
  To Select among a Collection of Models
    Models and Oracle
    Model Selection Procedures
  Risk Bound for Model Selection
    Oracle Risk Bound
  Optimality
    Minimax Optimality
    Frontier of Estimation in High Dimensions
    Minimal Penalties
  Computational Issues
  Illustration
  An Alternative Point of View on Model Selection
  Discussion and References
    Take-Home Message
    References
  Exercises
    Orthogonal Design
    Risk Bounds for the Different Sparsity Settings
    Collections of Nested Models
    Segmentation with Dynamic Programming
    Goldenshluger-Lepski Method
    Minimax Lower Bounds

Aggregation of Estimators
  Introduction
  Gibbs Mixing of Estimators
  Oracle Risk Bound
  Numerical Approximation by Metropolis-Hastings
  Numerical Illustration
  Discussion and References
    Take-Home Message
    References
  Exercises
    Gibbs Distribution
    Orthonormal Setting with Power Law Prior
    Group-Sparse Setting
    Gain of Combining
    Online Aggregation

Convex Criteria
  Reminder on Convex Multivariate Functions
    Subdifferentials
    Two Useful Properties
  Lasso Estimator
    Geometric Insights
    Analytic Insights
    Oracle Risk Bound
    Computing the Lasso Estimator
    Removing the Bias of the Lasso Estimator
  Convex Criteria for Various Sparsity Patterns
    Group-Lasso (Group Sparsity)
    Sparse-Group Lasso (Sparse-Group Sparsity)
    Fused-Lasso (Variation Sparsity)
  Discussion and References
    Take-Home Message
    References
  Exercises
    When Is the Lasso Solution Unique?
    Support Recovery via the Witness Approach
    Lower Bound on the Compatibility Constant
    On the Group-Lasso
    Dantzig Selector
    Projection on the l1-Ball
    Ridge and Elastic-Net

Estimator Selection
  Estimator Selection
  Cross-Validation Techniques
  Complexity Selection Techniques
    Coordinate-Sparse Regression
    Group-Sparse Regression
    Multiple Structures
  Scaled-Invariant Criteria
  References and Discussion
    Take-Home Message
    References
  Exercises
    Expected V-Fold CV l2-Risk
    Proof of Corollary 5.5
    Some Properties of Penalty (5.4)
    Selecting the Number of Steps for the Forward Algorithm

Multivariate Regression
  Statistical Setting
  A Reminder on Singular Values
  Low-Rank Estimation
    If We Knew the Rank of A*
    When the Rank of A* Is Unknown
  Low Rank and Sparsity
    Row-Sparse Matrices
    Criterion for Row-Sparse and Low-Rank Matrices
    Convex Criterion for Low Rank Matrices
    Convex Criterion for Sparse and Low-Rank Matrices
  Discussion and References
    Take-Home Message
    References
  Exercises
    Hard-Thresholding of the Singular Values
    Exact Rank Recovery
    Rank Selection with Unknown Variance

Graphical Models
  Reminder on Conditional Independence
  Graphical Models
    Directed Acyclic Graphical Models
    Nondirected Models
  Gaussian Graphical Models (GGM)
    Connection with the Precision Matrix and the Linear Regression
    Estimating g by Multiple Testing
    Sparse Estimation of the Precision Matrix
    Estimation of g by Regression
  Practical Issues
  Discussion and References
    Take-Home Message
    References
  Exercises
    Factorization in Directed Models
    Moralization of a Directed Graph
    Convexity of -log(det(K))
    Block Gradient Descent with the l1 / l2 Penalty
    Gaussian Graphical Models with Hidden Variables
    Dantzig Estimation of Sparse Gaussian Graphical Models
    Gaussian Copula Graphical Models
    Restricted Isometry Constant for Gaussian Matrices

Multiple Testing
  An Introductory Example
    Differential Expression of a Single Gene
    Differential Expression of Multiple Genes
  Statistical Setting
    p-Values
    Multiple Testing Setting
    Bonferroni Correction
  Controlling the False Discovery Rate
    Heuristics
    Step-Up Procedures
    FDR Control under the WPRDS Property
  Illustration
  Discussion and References
    Take-Home Message
    References
  Exercises
    FDR versus FWER
    WPRDS Property
    Positively Correlated Normal Test Statistics

Supervised Classification
  Statistical Modeling
    Bayes Classifier
    Parametric Modeling
    Semi-Parametric Modeling
    Nonparametric Modeling
  Empirical Risk Minimization
    Misclassification Probability of the Empirical Risk Minimizer
    Vapnik-Chervonenkis Dimension
    Dictionary Selection
  From Theoretical to Practical Classifiers
    Empirical Risk Convexification
    Statistical Properties
    Support Vector Machines
    AdaBoost
    Classifier Selection
  Discussion and References
    Take-Home Message
    References
  Exercises
    Linear Discriminant Analysis
    VC Dimension of Linear Classifiers in R^d
    Linear Classifiers with Margin Constraints
    Spectral Kernel
    Computation of the SVM Classifier
    Kernel Principal Component Analysis (KPCA)

Gaussian Distribution
  Gaussian Random Vectors
  Chi-Square Distribution
  Gaussian Conditioning

Probabilistic Inequalities
  Basic Inequalities
  Concentration Inequalities
    McDiarmid Inequality
    Gaussian Concentration Inequality
  Symmetrization and Contraction Lemmas
    Symmetrization Lemma
    Contraction Principle
  Birgé's Inequality

Linear Algebra
  Singular Value Decomposition (SVD)
  Moore-Penrose Pseudo-Inverse
  Matrix Norms
  Matrix Analysis

Subdifferentials of Convex Functions
  Subdifferentials and Subgradients
  Examples of Subdifferentials

Reproducing Kernel Hilbert Spaces

Notations

Bibliography

Index