Archetypal shapes based on landmarks and extension to handle missing data

Archetype and archetypoid analysis are extended to shapes, with the aim of finding representative shapes that are pure (extreme) cases. We focus on the case where the shape of an object is represented by a configuration matrix of landmarks. Because shape space is not a vector space, we work in the tangent space, that is, the linearization of shape space about the mean shape. Each observation is then approximated by a convex combination of either archetypes, which are themselves convex combinations of observations in the data set, or archetypoids, which are actual observations. As in the usual multivariate case, these tools can aid the understanding of shapes because they lie between clustering and matrix factorization methods. A new simplex visualization tool is proposed to display the results of archetypal analysis. We also propose new algorithms for performing archetypal analysis with missing data, and extend them to incomplete shapes. The methodologies are illustrated on a well-known data set and applied to an apparel design problem for children.
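
The convex-combination step described above can be illustrated with a minimal sketch. It assumes the shapes have already been mapped to tangent-space coordinates (one row of `X` per observation) and that the archetypes or archetypoids (rows of `Z`) are already fixed. The function names `project_to_simplex` and `convex_weights` are hypothetical, and the solver is plain projected gradient descent, chosen here for brevity; it is not necessarily the algorithm used in the paper.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of a vector onto the probability simplex."""
    u = np.sort(v)[::-1]                      # entries in decreasing order
    css = np.cumsum(u) - 1.0
    k = np.arange(1, v.size + 1)
    rho = np.nonzero(u - css / k > 0)[0][-1]  # last index kept positive
    return np.maximum(v - css[rho] / (rho + 1), 0.0)

def convex_weights(X, Z, n_iter=500):
    """Find A minimising ||X - A Z||^2 with A >= 0 and each row summing to 1.

    X: (n, m) observations in tangent-space coordinates (assumed given).
    Z: (k, m) fixed archetypes or archetypoids.
    """
    n, k = X.shape[0], Z.shape[0]
    A = np.full((n, k), 1.0 / k)              # start at the simplex centre
    step = 1.0 / np.linalg.norm(Z @ Z.T, 2)   # 1 / Lipschitz constant of the gradient
    for _ in range(n_iter):
        grad = (A @ Z - X) @ Z.T              # gradient of 0.5 * ||X - A Z||^2
        A = np.apply_along_axis(project_to_simplex, 1, A - step * grad)
    return A

# Toy data: three extreme points of a triangle and two observations.
Z = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
X = np.array([[0.25, 0.25], [1.0, 0.0]])
A = convex_weights(X, Z)
print(np.round(A, 3))
```

Each row of `A` gives the mixture coefficients of one observation with respect to the extreme points. For archetypoids, `Z` would simply be a subset of the rows of the data matrix.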
