The Asymptotic Theory of Extreme Order Statistics
Abstract. Let X_j denote the life length of the j-th component of a machine. In reliability theory, one is interested in the life length Z_n of the machine, where n is its number of components. Evidently, Z_n = min(X_j : 1 ≤ j ≤ n). Another important problem, extensively discussed in the literature, is the service time W_n of a machine with n components. If Y_j is the time required to service the j-th component, then W_n = max(Y_j : 1 ≤ j ≤ n). Early investigations usually assumed that the X's or Y's are stochastically independent and identically distributed random variables. If n is large, asymptotic theory is used to describe Z_n or W_n. Classical theory thus gives that the (asymptotic) distribution of these extremes (Z_n or W_n) is of Weibull type. While the independence assumptions are practically never satisfied, data usually fit the assumed Weibull distribution well. This contradictory situation leads to the following mathematical problems: (i) What type of dependence property of the X's (or the Y's) will result in a Weibull distribution as the asymptotic law of Z_n (or W_n)? (ii) Given the dependence structure of the X's (or Y's), what type of new asymptotic laws can be obtained for Z_n (or W_n)? The aim of the present paper is to analyze the recent development of the (mathematical) theory of the asymptotic distribution of extremes in the light of questions (i) and (ii). Several dependence concepts will be introduced, each of which leads to a solution of (i). In regard to (ii), the following result holds: the class of limit laws of extremes for exchangeable variables is identical to the class of limit laws of extremes for arbitrary random variables. One can therefore limit attention to exchangeable variables. The basic references to this paper are the author's recent papers in Duke Math. J. 40 (1973), 581–586, J. Appl. Probability 10 (1973), 122–129 and 11 (1974), 219–222, and Zeitschrift für Wahrscheinlichkeitstheorie 32 (1975), 197–207. For multivariate extensions see H. A. David and the author, J. Appl. Probability 11 (1974), 762–770, and the author's paper in J. Amer. Statist. Assoc. 70 (1975), 674–680. Finally, we shall point out the difficulty of distinguishing between several distributions based on data. Hence, only a combination of theoretical results and experimentation can be used as conclusive evidence on the laws governing the behavior of extremes.
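The definitions of the machine life length (a minimum) and the service time (a maximum) can be illustrated with a small simulation. This is only a sketch under the classical independence assumption that the abstract calls into question. The exponential lifetimes, the seed, and the function names are assumptions made for illustration. For i.i.d. exponential lifetimes with rate 1, the minimum of n components is exponential with rate n, so n times the minimum should average about 1 (the exponential is the Weibull law with shape 1).

```python
import random

def machine_life(component_lives):
    """Life length of the machine: it fails when its first component fails."""
    return min(component_lives)

def service_time(component_service_times):
    """Service time of the machine: it is ready when the slowest component is done."""
    return max(component_service_times)

rng = random.Random(0)   # fixed seed so the sketch is reproducible
n = 20                   # number of components (illustrative)
trials = 5000

# Assumption: i.i.d. exponential(1) component values.
one_machine = [rng.expovariate(1.0) for _ in range(n)]
z_n = machine_life(one_machine)
w_n = service_time(one_machine)

# Average of n * (minimum) over many simulated machines; close to 1 in theory.
scaled_minima = [
    n * machine_life([rng.expovariate(1.0) for _ in range(n)])
    for _ in range(trials)
]
mean_scaled_minimum = sum(scaled_minima) / trials
```

Replacing the independent draws with dependent ones is where questions (i) and (ii) of the abstract begin; the sketch does not attempt that.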
A first course in order statistics
Basic Distribution Theory. Discrete Order Statistics. Order Statistics from Some Specific Distributions. Moment Relations, Bounds, and Approximations. Characterizations Using Order Statistics. Order Statistics in Statistical Inference. Asymptotic Theory. Record Values. Bibliography. Indexes.
Understanding robust and exploratory data analysis
Stem-and-Leaf Displays (J. Emerson & D. Hoaglin). Letter Values: A Set of Selected Order Statistics (D. Hoaglin). Boxplots and Batch Comparison (J. Emerson & J. Strenio). Transforming Data (J. Emerson & M. Stoto). Resistant Lines for y Versus x (J. Emerson & D. Hoaglin). Analysis of Two-Way Tables by Medians (J. Emerson & D. Hoaglin). Examining Residuals (C. Goodall). Mathematical Aspects of Transformation (J. Emerson). Introduction to More Refined Estimators (D. Hoaglin, et al.). Comparing Location Estimators: Trimmed Means, Medians, and Trimean (J. Rosenberger & M. Gasko). M-Estimators of Location: An Outline of the Theory (C. Goodall). Robust Scale Estimators and Confidence Intervals for Location (B. Iglewicz). Index.
Finding Statistically Significant Communities in Networks
Community structure is one of the main structural features of networks, revealing both their internal organization and the similarity of their elementary units. Despite the large variety of methods proposed to detect communities in graphs, there is a strong need for multi-purpose techniques able to handle different types of datasets and the subtleties of community structure. In this paper we present OSLOM (Order Statistics Local Optimization Method), the first method capable of detecting clusters in networks while accounting for edge directions, edge weights, overlapping communities, hierarchies and community dynamics. It is based on the local optimization of a fitness function expressing the statistical significance of clusters with respect to random fluctuations, which is estimated with tools of Extreme and Order Statistics. OSLOM can be used alone or as a refinement procedure for partitions/covers delivered by other techniques. We have also implemented sequential algorithms combining OSLOM with other fast techniques, so that the community structure of very large networks can be uncovered. Our method performs comparably to the best existing algorithms on artificial benchmark graphs. Several applications on real networks are shown as well. OSLOM is implemented in freely available software (http://www.oslom.org), and we believe it will be a valuable tool in the analysis of networks.
Information Theoretic Learning - Renyi's Entropy and Kernel Perspectives
This book presents the first cohesive treatment of Information Theoretic Learning (ITL) algorithms to adapt linear or nonlinear learning machines in both supervised and unsupervised paradigms. ITL is a framework where the conventional concepts of second order statistics (covariance, L2 distances, correlation functions) are substituted by scalars and functions with information theoretic underpinnings, respectively entropy, mutual information and correntropy. ITL quantifies the stochastic structure of the data beyond second order statistics for improved performance without using full-blown Bayesian approaches that require a much larger computational cost. This is possible because of a non-parametric estimator of Rényi's quadratic entropy that is only a function of pairwise differences between samples. The book compares the performance of ITL algorithms with their second order counterparts in many engineering and machine learning applications. Students, practitioners and researchers interested in statistical signal processing, computational intelligence, and machine learning will find in this book the theory to understand the basics, the algorithms to implement applications, and exciting but still unexplored leads that will provide fertile ground for future research.
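A non-parametric estimator of Rényi's quadratic entropy that depends only on pairwise sample differences can be sketched as follows. The sketch assumes the usual Parzen-window form with a Gaussian kernel of size sigma, in which the pairwise kernel has size sigma times the square root of 2. The function names and the sample values are illustrative and are not taken from the book.

```python
import math

def gaussian_kernel(u, size):
    """Gaussian density with standard deviation `size`, evaluated at u."""
    return math.exp(-u * u / (2.0 * size * size)) / (math.sqrt(2.0 * math.pi) * size)

def information_potential(samples, sigma):
    """Mean of the Gaussian kernel over all pairwise differences x_i - x_j."""
    n = len(samples)
    pair_size = sigma * math.sqrt(2.0)  # two kernels of size sigma convolve to this
    total = sum(gaussian_kernel(xi - xj, pair_size) for xi in samples for xj in samples)
    return total / (n * n)

def renyi_quadratic_entropy(samples, sigma):
    """Estimate of Renyi's quadratic entropy: minus the log of the information potential."""
    return -math.log(information_potential(samples, sigma))

tight = [0.0, 0.1, 0.2, 0.3]
spread = [0.0, 1.0, 2.0, 3.0]
h_tight = renyi_quadratic_entropy(tight, sigma=0.5)
h_spread = renyi_quadratic_entropy(spread, sigma=0.5)
h_shifted = renyi_quadratic_entropy([x + 5.0 for x in tight], sigma=0.5)
```

Because only differences enter the sum, shifting every sample by a constant leaves the estimate unchanged, and more widely spread samples give a larger entropy. The double loop costs O(N^2) kernel evaluations.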
Extreme value theory in engineering
Introduction and Motivation. Order Statistics. Asymptotic Distributions of Maxima and Minima (I.I.D. Case). Shortcut Procedures: Probability Papers and Least-Square Methods. The Gumbel, Weibull, and Frechet Distributions. Selection of Limit Distributions from Data. Limit Distributions of k-th Order Statistics. Limit Distributions in the Case of Dependence. Multivariate and Regression Models Related to Extremes. Multivariate Extremes. Appendixes.
A generalization of median filtering using linear combinations of order statistics
We consider a class of nonlinear filters whose output is given by a linear combination of the order statistics of the input sequence. Assuming a constant signal in white noise, the coefficients in the linear combination are chosen to minimize the output MSE for several noise distributions. It is shown that the optimal order statistic filter (OSF) tends toward the median filter as the noise becomes more impulsive. The optimal OSF is applied to an actual noisy image and is shown to perform well, combining properties of both the averaging and median filters. A more general design scheme for applications involving nonconstant signals is also given.
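The filter structure described above, a linear combination of the order statistics in each window, can be sketched as below. The coefficient vectors shown are the two familiar special cases (median and moving average), not the MSE-optimal coefficients derived in the paper. The function name and the test signal are illustrative assumptions.

```python
def order_statistic_filter(signal, coeffs):
    """Slide a window of len(coeffs) over the signal; in each window, sort the
    samples and return the weighted sum of the sorted values."""
    width = len(coeffs)
    output = []
    for start in range(len(signal) - width + 1):
        ordered = sorted(signal[start:start + width])
        output.append(sum(a * x for a, x in zip(coeffs, ordered)))
    return output

# All weight on the middle order statistic gives the median filter.
median_coeffs = [0.0, 1.0, 0.0]
# Equal weights give the moving-average filter.
mean_coeffs = [1.0 / 3.0, 1.0 / 3.0, 1.0 / 3.0]

# Constant signal with a single impulse.
noisy = [1.0, 1.0, 9.0, 1.0, 1.0]
median_out = order_statistic_filter(noisy, median_coeffs)
mean_out = order_statistic_filter(noisy, mean_coeffs)
```

On this input the median coefficients suppress the impulse completely, while the averaging coefficients smear it across every window. Intermediate coefficient vectors trade off between these two behaviors.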
Ranked Set Sampling Theory with Order Statistics Background
Ranked set sampling employs judgment ordering to obtain an estimate of a population mean. The method is most useful when the measurement or quantification of an element is difficult but the elements of a set of given size are easily drawn and ranked with reasonable success by judgment. In each set all elements are ranked but only one is quantified. Sufficient sets are processed to yield a specified number of quantified elements and a mean for each rank. The average of these means is an unbiased estimate of the population mean regardless of errors in ranking. Precision relative to random sampling, with the same number of units quantified, depends upon properties of the population and success in ranking. In this paper the ranked set concept is reviewed with particular consideration of errors in judgment ordering.
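The procedure described above can be sketched as a simulation. This sketch assumes perfect ranking (sets are sorted on their true values) and a normal population with mean 10 and standard deviation 2. Both are illustrative assumptions, as are the function names. A judgment-ranking model with errors would replace the sort key.

```python
import random

def ranked_set_sample(draw, set_size, cycles, rng):
    """For each cycle and each rank r, draw a fresh set of `set_size` elements,
    rank it, and quantify only the element of rank r."""
    quantified = {rank: [] for rank in range(set_size)}
    for _ in range(cycles):
        for rank in range(set_size):
            # Assumption: ranking is error-free, so we sort on the true values.
            ranked = sorted(draw(rng) for _ in range(set_size))
            quantified[rank].append(ranked[rank])
    return quantified

def rss_mean(quantified):
    """Average of the per-rank means of the quantified elements."""
    rank_means = [sum(values) / len(values) for values in quantified.values()]
    return sum(rank_means) / len(rank_means)

def draw_element(rng):
    # Hypothetical population used only for this illustration.
    return rng.gauss(10.0, 2.0)

rng = random.Random(42)
sample = ranked_set_sample(draw_element, set_size=3, cycles=2000, rng=rng)
estimate = rss_mean(sample)
```

Each cycle draws set_size squared elements but quantifies only set_size of them, which is the trade the method makes when ranking is cheap and measurement is expensive.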
The Bootstrap Methodology in Statistics of Extremes—Choice of the Optimal Sample Fraction
The main objective of statistics of extremes is the prediction of rare events, and its primary problem has been the estimation of the tail index γ, usually performed on the basis of the largest k order statistics in the sample or on the excesses over a high level u. A question often addressed in practical applications of extreme value theory is the choice of either k or u, together with an adaptive estimation of γ. We are mainly interested here in the use of the bootstrap methodology to estimate γ adaptively. Although the methods provided may be applied, with adequate modifications, to the general domain of attraction of G_γ, γ ∈ ℝ, we illustrate them for heavy right tails, i.e. for γ > 0. Special relevance will be given to the use of an auxiliary statistic that is merely the difference of two estimators with the same functional form as the estimator under study, computed at two different levels. We shall also compare, through Monte Carlo simulation, these bootstrap methodologies with other data-driven choices of the optimal sample fraction available in the literature.
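To make the role of k concrete, here is a sketch of one standard estimator of γ > 0 built from the largest k order statistics, the Hill estimator. The abstract does not name the estimator it studies, so this choice is an assumption, and the sketch does not implement the bootstrap selection of k. The Pareto test sample and the seed are illustrative.

```python
import math
import random

def hill_estimator(sample, k):
    """Hill estimate of the tail index: the mean log-excess of the k largest
    observations over the (k+1)-th largest observation."""
    ordered = sorted(sample, reverse=True)
    if not 0 < k < len(ordered):
        raise ValueError("k must satisfy 0 < k < len(sample)")
    threshold = ordered[k]  # the (k+1)-th largest value
    return sum(math.log(x / threshold) for x in ordered[:k]) / k

rng = random.Random(7)
# Pareto sample with shape 2, so the true tail index is 1/2.
pareto_sample = [rng.paretovariate(2.0) for _ in range(5000)]
gamma_hat = hill_estimator(pareto_sample, k=1000)
```

For an exact Pareto sample any k gives a reasonable estimate. For other heavy-tailed models a small k gives high variance and a large k gives bias, which is why a data-driven choice of the sample fraction matters.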
Linear Order Statistic Estimation for the Two-Parameter Weibull and Extreme-Value Distributions from Type II Progressively Censored Samples
Point estimation for the scale and location parameters of the extreme-value (Type I) distribution by linear functions of order statistics from Type II progressively censored samples is investigated. Four types of linear estimators are considered: the best linear unbiased (BLU), an approximation to the BLU, unweighted regression, and a linearized maximum likelihood. Linear transformations of the estimators are also considered for reducing mean square errors. Exact bias, variance, and mean square error comparisons of the estimators are made for several censoring patterns. Since the natural logarithms of Weibull variates have extreme-value distributions, the investigation is applicable to estimation for Weibull distributions.
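The closing remark, that logarithms of Weibull variates follow an extreme-value distribution, can be checked numerically. If X is Weibull with scale b and shape c, then P(ln X ≤ y) = 1 − exp(−exp((y − ln b)·c)), which is the smallest-extreme-value law with location ln b and scale 1/c. The parameter values, seed, and function name below are illustrative assumptions.

```python
import math
import random

def smallest_extreme_value_cdf(y, location, scale):
    """CDF of the Type I extreme-value distribution for minima."""
    return 1.0 - math.exp(-math.exp((y - location) / scale))

rng = random.Random(1)
weibull_scale, weibull_shape = 2.0, 1.5

# random.weibullvariate takes (scale, shape) in that order.
log_values = [
    math.log(rng.weibullvariate(weibull_scale, weibull_shape))
    for _ in range(20000)
]

location = math.log(weibull_scale)  # extreme-value location parameter
scale = 1.0 / weibull_shape         # extreme-value scale parameter

# Compare the empirical and theoretical CDF at the location parameter.
empirical = sum(y <= location for y in log_values) / len(log_values)
theoretical = smallest_extreme_value_cdf(location, location, scale)
```

At the location parameter the theoretical value is 1 − 1/e, about 0.632, whatever the Weibull parameters are. Estimates of the extreme-value location and scale therefore map back to Weibull estimates through b = exp(location) and c = 1/scale.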