A study of the application of statistical methods to big data

Applying analysis and classification methods to big data is difficult. Several proposals randomly divide the population into b sub-samples and aggregate the parameters using an estimator based on the average of the parameters estimated on these sub-samples. This paper aims to find a solution that minimizes computation by selecting a small number b* of sub-samples while keeping the same precision. The approach can be applied to several methods to measure its relevance.
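
As an illustration of the divide-and-average scheme described above, the following minimal Python sketch splits a population into b sub-samples, estimates a parameter on each, and averages the estimates over either all b sub-samples or only b* of them. The function names, the simulated Gaussian population, the choice of the mean as the parameter, and the random choice of the b* sub-samples are all assumptions made for illustration; the selection rule actually used to pick the b* sub-samples is not specified here.

```python
import random
import statistics


def split_into_subsamples(data, b, seed=0):
    """Randomly partition `data` into b sub-samples of (nearly) equal size."""
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    # Taking every b-th element of the shuffled list gives b disjoint parts.
    return [shuffled[i::b] for i in range(b)]


def averaged_estimate(subsamples, estimator, b_star=None, seed=0):
    """Average `estimator` over all sub-samples, or over b_star of them.

    Assumption: the b_star sub-samples are drawn uniformly at random.
    """
    if b_star is None:
        chosen = subsamples
    else:
        chosen = random.Random(seed).sample(subsamples, b_star)
    return statistics.fmean(estimator(s) for s in chosen)


# Hypothetical population: 100,000 draws from a normal law with mean 10.
rng = random.Random(42)
population = [rng.gauss(10.0, 2.0) for _ in range(100_000)]

subsamples = split_into_subsamples(population, b=50)
full = averaged_estimate(subsamples, statistics.fmean)               # all b = 50
reduced = averaged_estimate(subsamples, statistics.fmean, b_star=5)  # b* = 5

print(f"b = 50 estimate: {full:.4f}")
print(f"b* = 5 estimate: {reduced:.4f}")
```

In this toy setting the reduced estimate evaluates the estimator on a tenth of the sub-samples; whether the loss of precision is acceptable depends on the estimator and on how the b* sub-samples are chosen.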