Large datasets: Segmentation, feature extraction, and compression

Large data sets with more than several million multivariate observations (tens of megabytes or gigabytes of stored information) are difficult or impossible to analyze with traditional software. The volume of output that must be scanned quickly erodes the investigator's ability to confidently identify all the meaningful patterns and trends that may be present. The purpose of this project is to develop both a theoretical foundation and a collection of tools for automated feature extraction that can be easily customized to specific applications. Cluster analysis techniques are applied as the final step of the feature extraction process, which helps make data surveying simple and effective.
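The abstract gives no algorithmic detail, so the following is only an illustrative sketch of the general segment, extract, cluster idea, not the project's actual method. Everything in it is an assumption made for illustration: the fixed-width windows, the choice of mean and standard deviation as features, the plain k-means step, and the hypothetical function names `segment`, `features`, and `kmeans`.

```python
import statistics


def segment(series, width):
    """Split a sequence into consecutive non-overlapping windows of `width` points.

    Trailing points that do not fill a whole window are dropped.
    """
    return [series[i:i + width] for i in range(0, len(series) - width + 1, width)]


def features(window):
    """Summarize one window by (mean, population standard deviation).

    These two features are an assumption; any per-segment summary could be used.
    """
    return (statistics.fmean(window), statistics.pstdev(window))


def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic, evenly spaced initial centers."""
    step = max(1, len(points) // k)
    centers = [points[i * step] for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda j: sq_dist(p, centers[j])) for p in points]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers
```

Under these assumptions, a long series would be reduced to one short feature vector per segment, for example `kmeans([features(w) for w in segment(series, 50)], k=2)`, so the investigator surveys a handful of cluster labels rather than every raw observation.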
