Two essays in statistics: a prediction divergence criterion for model selection & wavelet variance based estimation of latent time series models

This thesis is divided into two parts. The first presents a new criterion for model selection that is shown to be particularly well suited to "sparse" settings, which we believe to be common in many research fields. Our selection procedure is developed for linear regression models, smoothing splines, autoregressive models, and mixed linear models. These developments are then applied in biostatistics. The second part presents a new estimation method for the parameters of a time series model. The proposed method offers an alternative to maximum likelihood estimation that is straightforward to implement and is often the only feasible estimation method for complex models. We derive the asymptotic properties of the proposed estimator for inference and perform an extensive simulation study to compare it with existing methods. Finally, we apply our method in engineering to calibrate inertial sensors and demonstrate that it represents a considerable improvement over benchmark methods.
