Sparse Recovery and Deep Learning for Extracting Time-Dependent Models from Data

Training machines to extract useful information from data and to perform data-processing tasks is one of the most elusive and long-standing challenges in engineering and artificial intelligence. This thesis consists of three works in two branches of learning. One is sparse learning, which involves a sparse linear system and (typically) a convex optimization problem. The other is deep learning, which involves a nonconvex optimization problem associated with a deep neural network.

The first work [90] concerns extracting structured dynamics using sparsity. Learning governing equations allows for a deeper understanding of the structure and dynamics of the data. We present a random sampling method for learning structured dynamical systems from undersampled and possibly noisy state-space measurements. The learning problem takes the form of a sparse least-squares fit over a large set of candidate functions (a minimal code sketch of this formulation is given below). Based on a Bernstein-like inequality for partially dependent random variables, we provide theoretical guarantees on the recovery rate of the sparse coefficients and on the identification of the candidate functions for the corresponding problem. Computational results are demonstrated on datasets generated by the Lorenz-96 equation, the viscous Burgers' equation, and the two-component reaction-diffusion equations. This formulation has several advantages, including ease of use, theoretical guarantees of success, and computational efficiency with respect to the ambient dimension and the number of candidate functions.

The second work [72] discusses the convergence of a sparsity-promoting algorithm. One way to understand time-series data is to identify the underlying dynamical system that generates it. This task can be done by selecting an appropriate model and a set of parameters which best fit the dynamics while providing the simplest representation. One such approach is the sparse identification of nonlinear dynamics (SINDy) framework [11], which uses a sparsity-promoting algorithm that iterates between a partial least-squares fit and a thresholding (sparsity-promoting) step; this iteration is also sketched below. We provide theoretical results on the behavior and convergence of the algorithm proposed in [11]. In particular, we prove that the algorithm approximates local minimizers of an unconstrained ℓ0-penalized least-squares problem. From this, we provide sufficient conditions for general convergence, the rate of convergence, and conditions for one-step recovery. Examples illustrate that the rates of convergence are sharp. In addition, our results extend to other algorithms related to the algorithm in [11], and they provide theoretical verification of several observed phenomena.

The third work [89] focuses on stability and structure in ResNet and its variants. The residual neural network (ResNet) is a popular deep network architecture which is able to obtain high-accuracy results on several image-processing problems. In order to analyze the behavior and structure of ResNet, recent work has established connections between ResNets and continuous-time optimal control problems. In this work, we show that the post-activation ResNet is related to an optimal control problem with differential inclusions, and we provide continuous-time stability results for the differential inclusion associated with ResNet. Motivated by the stability conditions, we show that alterations of either the architecture or the optimization problem can generate variants of ResNet which improve the theoretical stability bounds.
In addition, we establish stability bounds for the full (discrete) network associated with two variants of ResNet; in particular, we bound the growth of the features and a measure of the sensitivity of the features with respect to perturbations (the last sketch below illustrates these two quantities). These results also help to show the relationship between the depth, the regularization, and the stability of the feature space. Computational experiments on the proposed variants show that the accuracy of ResNet is preserved and that the accuracy appears to be monotone with respect to the depth and various corruptions.
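
To make the first work's learning problem concrete, the following is a minimal sketch in Python of a sparse least-squares fit over a dictionary of candidate monomials, with the data rows randomly subsampled to mimic undersampled measurements. The dictionary construction, the ℓ1 (ISTA) solver standing in for the sparse fit, the toy ground-truth dynamics, and all parameter values are illustrative assumptions; the specific random sampling scheme and the recovery guarantees of [90] are not reproduced here.

```python
import numpy as np

def monomial_dictionary(X):
    """Candidate functions: constant, linear, and quadratic monomials.

    X has shape (m, d): m state-space measurements of a d-dimensional system.
    Returns the dictionary matrix (one column per candidate) and its labels.
    """
    m, d = X.shape
    cols, labels = [np.ones(m)], ["1"]
    for i in range(d):
        cols.append(X[:, i])
        labels.append(f"x{i}")
    for i in range(d):
        for j in range(i, d):
            cols.append(X[:, i] * X[:, j])
            labels.append(f"x{i}*x{j}")
    return np.column_stack(cols), labels

def ista(A, b, lam=0.05, iters=5000):
    """Minimize 0.5*||A c - b||^2 + lam*||c||_1 by proximal gradient (ISTA)."""
    step = 1.0 / np.linalg.norm(A, 2) ** 2          # 1/L with L = ||A||_2^2
    c = np.zeros(A.shape[1])
    for _ in range(iters):
        g = c - step * (A.T @ (A @ c - b))          # gradient step on the LS term
        c = np.sign(g) * np.maximum(np.abs(g) - step * lam, 0.0)  # soft threshold
    return c

rng = np.random.default_rng(0)
X = rng.uniform(-1.0, 1.0, size=(200, 3))           # synthetic state measurements
A, labels = monomial_dictionary(X)

c_true = np.zeros(A.shape[1])
c_true[labels.index("x1")] = 1.0                    # toy truth: dx0/dt = x1 - 0.5 x0 x2
c_true[labels.index("x0*x2")] = -0.5
b = A @ c_true + 0.01 * rng.normal(size=200)        # velocities with measurement noise

rows = rng.choice(200, size=80, replace=False)      # random undersampling of the data
c = ista(A[rows], b[rows])
print([(n, round(v, 3)) for n, v in zip(labels, c) if abs(v) > 1e-2])
```

On this toy problem the iteration returns the two planted coefficients; the soft-thresholding step is what promotes sparsity in the fit.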
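The algorithm analyzed in the second work is the sequentially thresholded least-squares iteration of the SINDy framework [11], which alternates a least-squares fit restricted to the active candidate set with a hard-thresholding step. The sketch below follows that published iteration; the stopping rule, threshold value, and synthetic data are illustrative choices, not those of [72].

```python
import numpy as np

def stlsq(A, b, threshold=0.2, max_iter=20):
    """Sequentially thresholded least squares, as in the SINDy framework [11].

    Alternates (i) hard thresholding, which fixes the sparsity pattern, with
    (ii) a least-squares refit restricted to the surviving candidates.
    """
    c, *_ = np.linalg.lstsq(A, b, rcond=None)       # initial full least-squares fit
    for _ in range(max_iter):
        active = np.abs(c) >= threshold             # thresholding (sparsity) step
        c_new = np.zeros_like(c)
        if active.any():
            c_new[active], *_ = np.linalg.lstsq(A[:, active], b, rcond=None)
        if np.allclose(c_new, c):                   # pattern has stabilized: done
            break
        c = c_new
    return c

rng = np.random.default_rng(0)
A = rng.normal(size=(100, 12))                      # toy candidate-function matrix
c_true = np.zeros(12)
c_true[[2, 7]] = [1.5, -0.8]                        # sparse ground-truth coefficients
b = A @ c_true + 0.01 * rng.normal(size=100)

c = stlsq(A, b)
print(np.flatnonzero(c), np.round(c[c != 0], 3))
```

In the analysis of [72], it is this thresholding step that ties the iteration to local minimizers of an ℓ0-penalized least-squares objective, with the threshold playing the role of the penalty parameter.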
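For the third work, a post-activation residual network applies the activation after the skip connection, i.e., x_{k+1} = relu(x_k + h (W_k x_k + b_k)). The sketch below propagates an input and a perturbed copy through such a network and records the feature-norm growth and the sensitivity to the perturbation, the two quantities controlled by the discrete stability bounds. The step size h, the affine residual maps, and the weight scaling are illustrative assumptions, not the variants proposed in [89].

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def forward(x, weights, biases, h=0.1):
    """Post-activation residual step: x_{k+1} = relu(x_k + h*(W_k x_k + b_k)).

    Returns the final features and the per-layer feature norms, the quantity
    bounded by the discrete growth estimates.
    """
    norms = [np.linalg.norm(x)]
    for W, b in zip(weights, biases):
        x = relu(x + h * (W @ x + b))               # skip connection, then activation
        norms.append(np.linalg.norm(x))
    return x, norms

rng = np.random.default_rng(1)
dim, depth, h = 8, 20, 0.1
weights = [rng.normal(size=(dim, dim)) / np.sqrt(dim) for _ in range(depth)]
biases = [np.zeros(dim) for _ in range(depth)]

x0 = rng.normal(size=dim)
y0 = x0 + 1e-3 * rng.normal(size=dim)               # perturbed copy of the input

xL, norms = forward(x0, weights, biases, h)
yL, _ = forward(y0, weights, biases, h)

print("feature norm, layer 0 -> L:", round(norms[0], 3), "->", round(norms[-1], 3))
print("sensitivity ||xL - yL|| / ||x0 - y0||:",
      round(np.linalg.norm(xL - yL) / np.linalg.norm(x0 - y0), 3))
```

Varying the depth or the scale of the weights in this sketch shows how feature growth and sensitivity can compound with depth, which is the behavior the stability conditions and the proposed variants are designed to control.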