Selection of a Representative Sample

Sometimes a large dataset must be reduced to just a few points, and it is desirable that these points be representative of the whole dataset. If the future uses of these points are not fully specified in advance, standard decision-theoretic approaches do not apply, since they require a utility function tied to a specific use. We present a methodology for choosing a small representative sample based on a mixture modeling approach.
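
To make the idea concrete, the following is a minimal sketch of one way a mixture model could drive the selection. It is an illustration under stated assumptions, not necessarily the procedure developed in the paper. It assumes one-dimensional data, a Gaussian mixture fitted by plain EM, and a selection rule that keeps the observed point nearest each fitted component mean. The function names `fit_gmm_1d` and `representative_sample` and the simulated data are hypothetical.

```python
import math
import random


def fit_gmm_1d(data, k, iters=200):
    """Fit a k-component 1-D Gaussian mixture by EM (illustrative sketch)."""
    xs = sorted(data)
    n = len(xs)
    # Deterministic start: means at evenly spaced quantiles, shared variance.
    means = [xs[int((j + 0.5) * n / k)] for j in range(k)]
    grand_mean = sum(xs) / n
    grand_var = max(sum((x - grand_mean) ** 2 for x in xs) / n, 1e-6)
    variances = [grand_var] * k
    weights = [1.0 / k] * k
    for _ in range(iters):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            dens = [
                w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
                for w, m, v in zip(weights, means, variances)
            ]
            total = sum(dens) or 1e-300  # guard against numerical underflow
            resp.append([d / total for d in dens])
        # M-step: update weights, means, and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj < 1e-12:
                continue  # leave an empty component unchanged
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, xs)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, xs)) / nj
            variances[j] = max(var, 1e-6)  # floor keeps the density finite
    return weights, means, variances


def representative_sample(data, k):
    """Return, for each fitted component, the observed point nearest its mean.

    Assumed selection rule for illustration; two components may share a point.
    """
    _, means, _ = fit_gmm_1d(data, k)
    return sorted(min(data, key=lambda x: abs(x - m)) for m in means)


# Simulated data (hypothetical): two well-separated groups of 100 points each.
random.seed(0)
data = [random.gauss(0.0, 1.0) for _ in range(100)]
data += [random.gauss(10.0, 1.0) for _ in range(100)]
reps = representative_sample(data, 2)
print(reps)
```

Returning observed points rather than the fitted means keeps the reduced sample a genuine subset of the data, which matters when each point carries attributes that cannot be averaged. The number of components `k` is fixed by hand here; choosing it from the data is a separate modeling question that this sketch does not address.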
